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ABSTRACT 

This  thesis  explores  dynamic  reconfiguration  and  link  fault  tolerance  in  a  Transputer 
network  using  software  controlled  crossbars.  A  message  exchange  system  was  designed, 
implemented  and  evaluated  to  facilitate  testing  various  aspects  of  dynamic  interconnec- 
tivity  between  processing  nodes  as  well  as  detection  and  recovery  from  failed  network 
links  without  loss  of  data.  As  implemented,  the  message  exchange  can  be  embedded  with 
iqjplication  code  which  can  direct  network  topology  to  facilitate  code  execution. 
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I.    INTRODUCTION 

A.  BACKGROUND 

Computer  technologies  continue  to  evolve  along  a  variety  of  paths,  each  with  an 
accompanying  set  of  advantages  and  disadvantages.  Typically,  the  primary  design  goal 
in  the  development  of  new,  superior  computer  is  to  achieve  processing  performance 
beyond  what  is  currently  available  in  a  given  cost  range.  The  speed  with  which  the 
assigned  task  is  accomplished  is  a  key  measure  of  a  computer's  ability. 

1.  Single  High  Speed  Processor 

One  path  toward  this  goal  concentrates  efforts  on  improving  the  abilities  of  an 
individual  processor:  the  device  at  the  heart  of  most  computers.  Vast  improvements  have 
been  realized  in  improving  the  processor  architecture  and  shrinking  the  dimensions  of  the 
components  to  reduce  the  time  needed  to  physically  move  control  and  data  signals  from 
one  point  to  another  within  the  processor.  Even  electrons  moving  at  near  the  speed  of 
light  require  a  finite  time  to  travel  finite  distances,  thus  imposing  an  upper  bound  upon 
maximum  obtainable  performance.  More  improvements  in  this  approach  will  no  doubt 
occur,  however,  they  will  be  realized  with  ever  greater  expense  for  each  small  increase  of 
processing  speed. 

2.  Multiple  Processors  Using  A  Shared  Bus 

Another  approach  involves  dividing  the  task  among  multiple  processors,  each 
assigned  a  portion  of  the  task.  This  allows  processing  of  different  pieces  of  the  task  to 
occur  simultaneously  or  in  parallel.  Ideally,  the  time  required  to  complete  a  given  task 
that  is  divided  in  such  a  manner  should  decrease  proportionally  with  the  number  of 
processors  assigned.  That  is,  doubling  the  number  of  processors  should  result  in  reducing 


the  time  required  to  complete  a  given  task  by  half.  In  practice,  however,  the  addition  of 
processors  does  not  result  in  linear  performance  improvements  since  each  task  will 
always  contain  some  portion  which  must  be  executed  sequentially,  thus  voiding  the 
parallel  processing  capability  available.  Amdahl  [Ref.  1]  explained  that  this  occurrence 
places  an  upper  limit  on  p>otential  gains  made  in  parallel  processing,  regardless  of  the 
number  of  processors  used.  This  paper  assumes,  however,  that  a  given  task  can  be 
divided  sufficiently  to  warrant  a  large  degree  of  parallel  processing. 

Another  obstacle  exists  preventing  parallel  machines  from  achieving  maximum 
potential  performance  gains.  Processors  working  in  parallel  will  need  to  share  data  as  the 
processing  proceeds.  This  sharing  of  resources  among  many  processors  leads  again  to 
the  problem  of  stretching  the  capabilities  of  individual  components.  In  this  situation,  the 
common  data  path  and  common  memory  storage  devices  will  become  a  performance 
restriction  as  the  number  of  processors  involved  is  increased  [Ref.  2:p.  5].  Greater 
advances  in  processor  technologies  over  bus  and  backplane  technologies  further 
exacerbates  this  problem,  since  by  adding  faster  processors  to  a  given  balanced 
multiprocessor  bus,  the  bus  will  become  overloaded  and  hence  slow  the  entire  system. 
3.     Multiple  Processors  Using  Point-To-Point  Communications 

To  avoid  the  potential  bottienecks  associated  with  having  all  processors  sharing 
a  common  communication  resource,  processors  can  communicate  more  directiy  using 
distributed  connections,  that  is,  each  processor  has  its  own  finite  set  of  independent 
connections.  As  the  number  of  nodes  in  a  network  grows,  however,  a  point  is  reached  in 
which  all  pairs  of  processors  will  no  longer  be  directly  connected.  The  transport 
mechanisms  used  for  data  communications  and  which  must  deal  with  these  problems  can 
be  classified  into  two  broad  categories:  packet  switching  and  circuit  switching. 


a.  Packet  Switching 

In  a  large  network  where  all  points  cannot  directly  communicate,  the 
connection  must  be  completed  through  intermediaries.  This  is  known  as  packet  switch- 
ing or  packet  routing.  Data  is  carved  into  blocks  of  convenient  size  and  passed  with 
routing  information  from  node  to  node  until  the  desired  destination  is  reached.  Packet 
routing  is  used  extensively  in  hypercube  architecture  computers  which  are  a  network  of 
processors  symmetrically  interconnected  so  as  to  minimize  the  distance  between  any  two 
nodes.  Although  this  method  is  much  less  restrictive  than  a  shared  bus,  some  overhead  is 
still  incurred  as  processing  time  is  dedicated  to  the  task  of  passing  data. 

b.  Circuit  Switching 

In  order  to  avoid  the  need  for  passing  data  along  intermediate  nodes, 
additional  hardware  can  be  used  to  assign  specific  temporary  paths  between  processors 
wishing  to  communicate.  This  is  known  as  circuit  switching  and  exists  quite  commonly 
in  the  form  of  a  telephone  exchange.  When  one  party  dials  another,  a  connection  is  made 
and  dedicated  for  that  specific  communication  for  the  duration  of  the  call.  When  the  call 
is  complete,  the  connection  is  broken  and  may  be  used  for  a  new  connection.  This  clearly 
avoids  the  overhead  of  intermediate  nodes  passing  data,  however,  new  overhead  is 
created  in  the  task  of  managing  the  available  connections.  In  circuit  switching,  the 
connection  must  be  made  and  broken  for  each  communication,  and  the  event  of  "busy" 
connections  must  be  handled. 

B.    TRANSPUTER  TECHNOLOGY 

Another  technology  has  evolved  which  attempts  to  combine  the  advantages  of  the 
examples  mentioned  above  while  hoping  to  avoid  the  difficulties.  A  device  called  the 
Transputer  combines  a  processor  with  its  own  local  memor>'  storage  and  four  specialized 
data  paths  or  links  designed  to  communicate  directly  with  other  Transputers.  Thus,  as  the 


number  of  processors  and  processing  power  in  the  network  grows,  the  amount  of  memory 
and  links  available  also  expand.  By  providing  its  own  direct  communications  path,  the 
problems  of  bus  contention  are  largely  overcome.  However,  every  Transputer  in  a 
network  larger  than  five  nodes  cannot  communicate  directly  with  each  other  since  each 
Transputer  contains  only  four  hardware  links. 

Although  the  number  of  communication  paths  grows  as  Transputers  are  added  to  the 
network,  establishing  a  dedicated  path  directly  between  any  two  processors  will  not 
always  be  possible.  To  allow  communications  between  any  two  nodes,  the  message  must 
either  be  passed  via  intermediate  nodes  (packet  routing)  or  additional  hardware  must  be 
employed  to  connect  a  dedicated  path  (circuit  switching).  Because  of  the  synchronous 
nature  of  communication  in  Transputers,  packet  routing  may  introduce  deadlock  if  it  is 
not  implemented  carefully  [Ref.  3:p.  110]. 

As  a  further  complication,  growth  in  the  number  of  links  between  Transputers 
increases  the  probability  of  communications  failure.  When  communicating  via  links  a 
failure  of  the  medium  causes  the  link  engines  involved  to  wait  forever  which  will  likely 
result  in  failure  of  the  processing  in  progress.  Even  without  link  failures  or  synchroniza- 
tion limitations,  packet  routing  systems  incur  a  potentially  significant  overhead  in  passing 
data  between  nodes. 

C.    MOTIVATION 

Modem  weapons  systems  depend  heavily  upon  the  availability  of  powerful  proces- 
sors to  monitor  equipment,  evaluate  sensor  inputs,  display  command  and  control 
information  and  calculate  trajectories  and  weapons  intercepts.  Weapons  systems  control 
computers  can  be  described  by  following  characteristics  [Ref.  4:p.  11]: 

•     Physically  distributed:  specialized  computers  are  located  throughout  the  platform, 
each  dedicated  to  specific  tasks  but  also  in  communication  with  the  others, 


•  Fault  tolerant:  loss  of  some  computers  or  some  communications  paths  should  not 
bring  the  entire  integrated  system  to  a  halt, 

•  High  performance:  to  handle  demands  of  sophisticated  sensors,  and 

•  Flexible  and  extensible:  to  respond  quickly  to  a  changing  threat  environment. 

The  AEGIS  Modelling  Laboratory  at  the  Naval  Postgraduate  School  considers  the 
Transputer's  architecture  and  capabilities  as  potentially  useful  to  fulfill  the  greater 
demands  of  future  weapons  systems  [Ref  4:p.  11].  Although  parallel  processors  in 
general,  and  Transputers  in  particular  can  satisfy  the  above  requirements,  communica- 
tions between  processors  may  still  present  unacceptable  restrictions  either  in  ability  to 
link  any  two  Transputers  in  the  network  or  in  delays  in  obtaining  the  connection. 

D.    OBJECTIVES 

According  to  Lauwereins  [Ref.  5:p.  223],  "...The  two  major  problems  encountered 
when  interconnecting  cooperating  microcomputers  are  the  detection  of  parallelism  in  the 
application  program  and  the  overload  of  communication  lines."  No  attempt  is  made  here 
to  develop  methods  for  detecting  parallelism.  Instead,  this  thesis  explores  dynamic 
reconfiguration  and  link  fault  tolerance  in  a  network  of  Transputers  as  a  method  for 
conducting  concurrent  processing  without  prohibitive  control  overhead  and  thereby 
overcoming  the  difficulties  noted  above. 

The  message  exchange  is  a  combination  of  procedures  and  "off-the-shelf  hardware 
allowing  direct  communication  between  any  two  processors  in  the  network.  In  the 
message  exchange,  specific  connections  are  dynamically  requested,  created  and  terminat- 
ed as  required  by  the  application  code  with  the  use  of  program  controlled  switches.  Thus, 
its  development  served  as  a  test  vehicle  in  developing  routines  for  software  controlled 
link  switches  and  link  fault  tolerance.  The  message  exchange  concept  is  in  contrast  to 
packet  routing  architectures  in  which  all  data  as  well  as  control  signals  are  passed  via 
intermediate  nodes  to  reach  non-adjacent  destination  nodes. 


Communication  fault  detection  and  recovery  procedures  are  implemented  to  increase 
the  reliability  of  the  links  and  to  ensure  that  requested  connections  are  eventually 
completed  when  link  resources  become  available.  With  a  functional  message  exchange 
system,  a  programmer  can  insert  specific  task  instructions  within  this  structure  and  have 
program  controlled  access  to  all  processors  in  the  network  without  additional  code  to 
manage  the  communications. 

E.    THESIS  ORGANIZATION 

Chapter  II  describes  in  detail  the  various  tools  used,  including  Transputer  architec- 
ture and  operation,  the  principles  upon  which  it  is  based,  and  fundamentals  of  the 
programming  language  employed. 

Chapter  TIT  describes  the  specific  equipment  upon  which  the  message  exchange  was 
implemented. 

Chapter  IV  explains  the  code  and  structures  of  the  message  exchange  program.  This 
includes  explanations  of  the  various  modules  of  the  program  in  both  normal  operation 
and  in  the  event  of  communications  failure. 

Chapter  V  discusses  the  testing  and  evaluation  of  the  message  exchange  and  the 
aspects  of  program  structure  and  communications  load  which  affect  performance. 

Chapter  VI  presents  the  conclusions  reached  as  a  result  of  development,  using  and 
evaluating  the  message  exchange.  Recommendations  for  further  research  concerning 
Transputers,  integrated  weapons  systems  and  this  project  is  also  included. 


n.    THE    TRANSPUTER 

A.    COMMUNICATING  SEQUENTIAL  PROCESSES 

Transputer  is  a  combination  of  the  words  transistor  and  computer  to  emphasize  its 
intended  purpose:  a  single-chip  processor  to  be  used  as  a  fundamental  building  block  in 
large  collections  of  parallel  processors  in  much  the  same  way  that  transistors  are  basic 
components  of  more  complex  and  capable  devices  [Ref.  6:p.  25].  By  using  a  collection 
of  Transputers,  large  parallel  systems  can  be  constructed  which  operates  concurrently  and 
communicates  through  links.  The  Transputer  was  developed  by  INMOS  Limited  of 
Bristol,  United  Kingdom,  and  has  since  expanded  into  a  family  of  very  large  scale 
integrated  (VLSI)  components  with  different  capabilities.  A  typical  member  of  the 
Transputer  family  is  a  single  chip  containing  processor,  memory,  and  communication 
links  which  provide  point  to  point  connection  between  Transputers. 

Fundamental  to  the  design  of  the  Transputer  is  the  concept  of  a  process  and  how  it 
relates  to  concurrency.  A  process  consists  of  a  list  of  instructions  intended  to  be  executed 
in  sequence,  and  can  be  thought  of  as  the  program  in  execution  in  a  single  processor  or  as 
the  entity  to  which  the  processor  is  currently  assigned  [Ref.  7:p.  55].  This  definition 
applies  not  only  to  parallel  processors  but  to  a  single  processor  whose  processing  power 
is  sequentially  shared,  or  time  sliced,  among  multiple  processes. 

Hoare's  [Ref.  8]  Communicating  Sequential  Processes  (CSP)  is  one  model  for 
concurrent  or  parallel  programming,  and  is  central  to  the  design  of  the  Transputer.  In 
CSP,  a  program  is  a  collection  of  processes  which  can  be  combined  to  execute 
sequentially  on  a  single  processor  or  in  parallel  on  multiple  processors.  Tlie  data  space 
for  any  process  executing  in  parallel  is  disjoint,  thus  alleviating  the  need  for  sharing 


memory  between  processors.  Although  shared  memory  is  not  available,  processes  must 
still  communicate  with  each  other.  Therefore,  CSP  utilizes  message  passing  between  any 
pair  of  parallel  processes  via  declared  communications  channels  between  the  two 
processes. 

B.    PROCESSOR  ARCHITECTURE 

A  block  diagram  of  a  T800  Transputer  is  shown  in  Figure  2.1  and  the  major 
components  are  discussed  in  the  following  sections.  Internally,  a  Transputer  consists  of  a 
memory,  processor  and  communications  system  connected  via  a  32  bit  bus.  The  bus  also 
connects  to  the  external  memory  interface  enabling  additional  local  memory  to  be  used. 

The  processor,  memory  and  communications  system  each  occupy  about  25%  of  the 
total  silicon  area,  the  remainder  being  used  for  power  distribution,  clock  generators  and 
external  connections  [Ref.  9:p.  27]. 

1.     Central  Processing  Unit  (CPU)  and  Sequential  Operation 

The  processor  contains  32  bit  processing  logic,  32  bit  integer  arithmetic  unit, 
instruction  and  work  pointers  and  an  operand  register.  The  on-chip  four  kilobyte  high 
speed  memory,  which  can  store  both  data  and  code,  is  directly  accessed  by  the  processor. 
Where  larger  amounts  of  memory  are  required,  the  processor  can  access  a  maximum  of 
four  gigabytes  of  memory  via  the  external  memory  interface  [Ref9:p.  47].  Figure  2.2 
shows  a  block  diagram  of  the  processor. 
a.     Register  Set 

The  design  of  the  Transputer  processor  exploits  the  availability  of  fast 
on-chip  memory  by  employing  only  six  registers  used  in  the  execution  of  a  sequential 
process.  The  combination  of  very  few  registers  and  a  simple  instruction  set  enables  the 
processor  to  have  relatively  simple  and  fast  data  paths  and  control  logic.  The  six  registers 
are[Ref9:p.  28]: 
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Figure  2.1.  Transputer  Architecture  Block  Diagram 


•  The  workspace  pointer  (W)  which  points  to  an  area  of  memory  storage  where 
local  variables  are  kept. 

•  The  instruction  pointer  (I)  which  points  to  the  next  instruction  to  be  executed. 

•  The  operand  register  (O)  which  is  used  in  the  formation  of  instruction  operands. 

•  The  A,  B  and  C  registers  which  form  an  evaluation  stack. 

A,  B  and  C  are  sources  and  destinations  for  most  arithmetic  and  logical 
operations.  Loading  a  value  into  the  stack  pushes  B  into  C  and  A  into  B  before  loading 
A.  Storing  a  value  from  A  pops  B  into  A  and  C  into  B. 
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Figure  2.2.  Transputer  Processor  Block  Diagram 

Expressions  are  evaluated  on  the  evaluation  stack  and  instructions  refer  to 
the  stack  implicitly.  For  example,  the  add  instruction  pops  the  first  two  values  (A  and  B) 
from  the  stack,  performs  the  addition,  then  pushes  the  result  back  onto  the  stack.  The  use 
of  a  stack  removes  the  need  for  additional  instructions  to  explicitly  manuever  data  during 
the  performance  of  an  operation.  Statistics  gathered  from  a  large  number  of  programs 
show  that  three  registers  provide  an  effective  balance  between  code  compactness  and 
implementation  complexity.  [Ref  9:p.  47] 
b.     Machine  Instruction  Format 

The  Transputer  instruction  set  has  been  designed  for  simple  and  efficient 
compilation  of  high-level  languages.  All  instructions  have  the  same  format  and  are 
designed  to  give  a  compact  representation  of  frequent  program  operations.  Instructions 
are  eight  bits  long  and  divided  into  two  parts.  The  low-order  four  bits  are  the  data  and 
the  high-order  four  bits  are  the  opcode  or  function  as  shown  in  Figure  2.3.  The  data  is 
loaded  into  the  lower  four  bits  of  the  32  bit  operand  register  which  is  operated  upon  by 
the  opcode.  This  allows  32  bits  of  data  to  be  used  if  required. 
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Instructions  operate  on  the  entire  Operand  Register  as  the  operand 
Ail  intructions  load  their  data  into  the  lower  4  bits 

«                                         7              43                0 

T                                        Function        Data 

O     -^ 

Operand  Register 

31                                                         4  3              0 

Figure  2.3.  Instruction  Format 

Four  bits  can  provide  identification  for  16  instructions  which  are  known  as 
direct  functions.  Thirteen  direct  functions  actually  manipulate  the  processor.  These 
single  byte  instructions  are  the  most  frequently  used  such  as  store,  load,  calls  and  jumps. 
The  three  remaining  instructions:  Pfix,  Nfix,  and  Opr  manipulate  the  operand  register  by 
constructing  and  executing  larger  instructions.  [Ref  9:p.  29] 

Pfix  loads  its  four  data  bits  into  the  operand  register  then  shifts  the  operand 
register  left  four  bits.  Nfix  is  similar  except  the  contents  of  the  operand  register  are 
complemented  prior  to  shifting  left.  Consequently,  operands  can  be  extended  to  any 
length  up  to  the  length  of  the  operand  register  by  a  sequence  of  prefix  instructions. 
Operands  in  the  range  of  -256  to  +255  (eight  bits  and  two  operation  types)  can  be 
represented  using  one  prefix  instruction.  Finally,  Opr  causes  its  operand  to  be  interpreted 
as  an  operation,  known  as  an  indirect  function,  to  be  executed  on  the  values  held  in  the 
evaluation  stack.  This  allows  up  to  16  such  instructions  to  be  encoded  in  a  single  byte 
instruction.  However,  the  prefix  instructions  can  be  used  to  extend  the  operand  of  an  Opr 
instruction  just  like  any  other.  The  instruction  representation  therefore  provides  for  an 
indefinite  number  of  operations.  [Ref  9:p.  30] 
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Encoding  of  the  indirect  functions  is  chosen  so  that  the  most  frequentiy 
occurring  operations  are  represented  without  the  use  of  a  prefix  instruction.  These 
include  arithmetic,  logical  and  comparison  operations  such  as  add,  exclusive  or  and 
greater  than.  Less  firequendy  occurring  operations  have  encoding  which  require  a  single 
prefix  operation.  Measurements  show  that  about  70%  of  executed  instructions  are 
encoded  in  a  single  byte;  that  is,  without  the  use  of  prefix  instructions.  Many  of  these 
instructions,  such  as  load  constant  and  a^  require  just  one  processor  cycle  [Ref  9:p.  49]. 
2.     Link  Engines  and  Communications 

Four  identical  INMOS  bidirectional  links  provide  communication  between 
processors,  and  between  processors  and  other  compatible  signal  sources.  Each  link 
consists  of  an  input  and  output  channel,  and  can  perform  simultaneous,  independent  input 
and  output  communications.  Instructions  are  included  in  the  Transputer's  instruction  set 
for  performing  input  and  output  operations  using  the  links.  A  block  diagram  of  a 
communications  link  is  shown  in  Figure  2.4.  Each  link  engine  also  includes  an 
independent  direct  memory  access  (DMA)  controller  and  serial  communication  logic. 
Initiating  a  transfer  down  a  link  takes  about  a  microsecond  (20  processor  cycles),  but 
once  the  transfer  is  started  it  will  proceed  with  minimal  direction  from  the  processor, 
consuming  typically  only  four  processor  cycles  (per  active  link  engine)  every  four 
microseconds  [Ref.  10:p.  15].  Although  communications  are  synchronized  between 
sender  and  receiver,  they  are  not  synchronous  with  clock  input  (Clockln)  to  the 
Transputer  and  are  insensitive  to  phase.  Thus,  links  from  independentiy  clocked  systems 
may  communicate  providing  only  that  frequencies  are  within  specifications 
[Ref9:p.  370]. 
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Figure  2.4.  Hardware  Communication  Link  Block  Diagram 

To  execute  an  external  communication  via  a  hardware  link,  the  communication 
link  address,  size  and  location  of  a  message  are  specified.  This  initializes  the  DMA 
controller  with  the  size  and  location  of  the  message  to  be  communicated.  The  process 
executing  the  communication  instruction  is  suspended  and  a  pointer  for  later  resuming 
the  process  is  saved.  If  available,  another  ready  process  from  an  active  process  list  may 
then  be  executed.  When  both  the  sending  and  receiving  links  have  been  initialized  in  this 
manner,  the  message  communication  proceeds.  When  the  communication  is  complete, 
the  process  which  was  suspended  for  communication  is  allowed  to  resume.  A  more 
detailed  description  of  parallel  process  management  is  provided  in  [Ref  9]. 

Messages  are  transmitted  by  the  links  one  byte  at  a  time  in  a  bit-serial  format. 
After  a  receiver  has  recognized  the  reception  of  a  byte  and  is  capable  of  receiving 
another,  the  receiver  transmits  an  acknowledge  message.  The  transmitter  will  await 
reception  of  the  acknowledge  message  before  transmitting  the  next  message  byte.  Since 
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the  link  hardware  performs  no  error  checking  on  messages,  the  purpose  of  the 
acknowledge  message  is  solely  to  control  the  flow  of  message  bytes  between  the  links. 
Figure  2.5  shows  a  formatted  message  byte  and  an  acknowledge  message.  INMOS 
recommends  that  error  checking  should  be  provided  in  software  if  external  link  length 
exceeds  0.9  meters.  Additional  hardware  is  available  which  can  translate  the  link 
protocol  into  forms  suitable  for  long  haul  communications  and  formats  compatible  with 
different  system.  Information  regarding  INMOS  link  connections  and  conversion  to  and 
from  other  protocols  or  signal  technologies,  including  fiber  optics,  can  be  found  in 
[Ref.  11],  and[Ref.  12]. 
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Figure  2.5.  Serial  Communication  Link  Protocol 

Internal  communications,  that  is,  communications  between  parallel  processes 
executing  on  a  single  Transputer,  can  also  be  performed.  In  this  case,  memory  transfers 
and  software  register  equivalents  are  used  instead  of  the  DMA  link  engines  since  all 
activity  takes  place  within  the  same  processor.  A  memory  address  and  the  size  and 
location  of  a  message  are  specified,  then  the  communications  instruction  is  executed  just 
as  in  the  external  communication  case.  When  both  sending  and  receiving  processes  are 
ready  to  communicate,  the  task  is  accomplished  by  performing  a  memory-to-memory 
transfer  of  the  message  data.  From  the  high  level  language  programmer's  view,  internal 
and  external  link  communications  are  handled  identically. 
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3.     Memory 

Memory  addressing  starts  at  the  most  negative  32  bit  address  (hexadecimal 
80000000)  with  four  kilobytes  of  fast,  static  memory  on-chip  for  high  rates  of  data 
throughput  Each  memory  access  takes  one  processor  cycle  [Ref  9:p  70].  It  is  important 
to  note  that  the  on-chip  memory,  which  is  rated  at  50  nanosecond  cycle  time,  is  not  used 
as  a  memory  cache  for  off-chip  memory,  but  as  the  first  portion  of  one  linear  address 
range.  A  total  of  four  gigabytes  of  memory  can  be  addressed  by  the  Transputer  and 
accessed  at  a  sustained  rate  of  26.6  megabytes  per  second  (at  20  MHz  clock  speed) 
[Ref  9:p.  43],  some  of  which  is  dedicated  for  specific  purposes  in  Transputer  operation. 
A  memory  map  is  shown  in  Figure  2.6. 
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Figure  2.6.  T800  Transputer  Memory  Map 

Off-chip  memor>'  is  accessed  through  an  external  memory  interface  controlling  a 
multiplexed  32  bit  address  and  data  bus.  Since  this  one  bus  must  be  shared  between 
address  and  data  information  access  to  external  memory  is  considerably  slower  than 
on-chip.  A  range  of  three  to  five  processor  cycles  is  typically  required  to  access  off-chip 
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memory.  For  this  reason,  the  programmer  pursuing  performance  optimization  should  be 
aware  of  code  and  data  placement  in  memory  and  ensure  that  frequently  used  variables 
and  code  segments  are  placed  in  on-chip  memory.  If  faced  with  a  choice,  on-chip 
memory  is  better  suited  for  data,  since  four  instructions,  eight  bits  each,  can  be  obtained 
in  one  32  bit  fetch,  hence  code  accesses  are  less  frequent  [Ref  10:p.  2]. 

Additional  hardware  is  provided  on  the  external  memory  interface  to  provide 
refresh  cycles  for  dynamic  random  access  memory.  Control  lines  and  signals  are  also 
provided  to  facilitate  peripheral  device  direct  memory  access  to  the  external  segment  of 
memory. 

4.     Timers 

The  Transputer  provides  two  32  bit  timer  clocks  which  'tick'  periodically. 
These  timers  provide  accurate  process  timing,  allowing  processes  to  deschedule  them- 
selves in  a  specific  time  and  allowing  the  programmer  to  execute  instructions  at  either  an 
absolute  time  or  after  a  relative  interval  from  a  current  time.  One  timer  is  accessible  only 
to  high  priority  processes  and  is  incremented  every  microsecond,  cycling  completely  in 
approximately  4295  seconds.  The  other  is  accessible  only  to  low  priority  processes  and 
is  incremented  every  64  microseconds  giving  exactly  15625  ticks  in  one  second.  It  has  a 
full  period  of  approximately  76  hours.  [Ref  9:p.  52] 

Instructions  are  provided  for  initializing  and  reading  the  current  timer  values. 
Additionally,  instructions  are  provided  for  suspending  execution  of  a  process  until  a 
specified  timer  value  is  reached.  To  implement  this  instruction,  the  Transputer  maintains 
a  linked  list  of  suspended  processes  waiting  on  timer  values.  Separate  lists  are 
maintained  for  the  high  and  low  priority  timers. 

A  process  can  arrange  to  perform  a  timer  input  in  which  case  it  will  become 
ready  to  execute  after  a  specified  time  has  been  reached.  The  timer  input  instruction 
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requires  a  time  to  be  specified.  If  this  time  is  in  the  past  then  the  instruction  has  no 
effect.  If  the  time  is  in  the  future  the  process  is  deschedule  and  rescheduled  when  the 
time  is  reached. 

5.  External  Event  Input 

EventReq  and  EventAck  provide  an  asynchronous  handshaking  interface 
between  an  external  event  and  an  internal  process  and  behaves  similarly  to  an  external 
interrupt  input.  To  the  Transputer,  this  external  event  input  appears  as  a  communications 
channel  which  is  capable  of  transmitting  a  signal  to  a  user's  program. 

A  user's  program  requesting  input  firom  the  external  event  channel  will  be 
suspended  if  the  external  event  input  is  not  being  asserted.  Then,  when  the  external  event 
input  is  asserted  the  process  will  be  added  to  the  end  of  an  active  process  list  to  wait  its 
turn  for  execution.  Either  high  or  low  priority  processes  may  request  input  from  the 
external  event  channel,  but  only  one  process  may  use  the  event  channel  at  any  given  time. 

6.  Floating  Point  Unit  (FPU) 

The  64  bit  FPU  of  the  TSOO  Transputer  provides  single  and  double  length 
arithmetic  floating  point  standard  ANSI-IEEE  745-1985.  It  is  able  to  perform  floating 
point  arithmetic  concurrentiy  with  the  CPU  sustaining  in  excess  of  2.25  Mflops  on  a  30 
MHz  device.  All  data  communication  between  memory  and  the  FPU  occurs  under 
control  of  the  CPU.  [Ref  9:p.  62] 

The  FPU  consists  of  a  microcoded  computing  engine  with  a  three  element 
floating  point  evaluation  stack  for  manipulation  of  floating  point  numbers.  Each  stack 
register  can  hold  either  32  bit  or  64  bit  data,  indicated  by  an  associated  flag  which  is  set 
when  a  floating  point  value  is  loaded.  The  FPU  stack  behaves  similar  to  the  CPU  stack 
described  in  Chapter  II. 
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7.     Benchmarks  Performance  Comparisons 

Shepherd  [Ref.  13:pp.  10-13]  provides  the  benchmark  data  shown  in  Table  2.1 
comparing  members  of  the  Transputer  family  with  various  other  machines.  Dhrystones 
and  Whetsones  are  standard  algorithms  used  in  measurement  of  processor  capability. 
Reference  13  includes  the  test  code  used,  as  well  as  complete  reference  listings  of  all 
third-party  tests  and  vendor  publications. 

TABLE  2.1 

TRANSPUTER  PERFORMANCE  BENCHMARK  COMPARISONS 


Computer  Evaluated 

Dhrystones 

Single  Precision 

Double  Precision 

all  figures  in  thousands 

per  Second 

Whetstones 

Whetstones 

(n/a:  figures  not  available) 

IBM  3090/200 

31,250 

n/a 

n/a 

T800-30  (projected) 

n/a 

6,000 

3,800 

T800-20 

8,547 

4,000 

2,500 

VAX  8600 

6,423 

n/a 

n/a 

Intel  80386-16 

4,300 

n/a 

n/a 

Motorola  68020-17 

3,977 

n/a 

n/a 

Intel  80286/80287 

1,976 

300 

n/a 

VAX  11-780  (-hFPA) 

1,650 

1,083 

715 

SUN  3 

n/a 

860 

790 

T4 14-20 

n/a 

663 

163 

Motorola  MC  68000-8 

1,136 

n/a 

n/a 

IBM  PC-RT  (-hFPA) 

n/a 

200 

n/a 

Intel  8086/8087 

n/a 

178 

152 

IBM  PC-RT 

n/a 

12 

n/a 

C.    PROCESSOR  OPERATION  -  PARALLEL  MODE 
1.     Process  Representation 

In  both  sequential  and  parallel  operation,  the  fundamental  unit  of  execution  is 
the  process.  A  process  may  consist  of  many  sub-processes  executing  concurrently  by 
sharing  the  processor's  resources.  A  process  may  be  assigned  as  either  a  high  or  low 
priority  in  execution.  High  priority  processes  execute  without  external  interruption, 
while  low  priority  processes  run  until  blocked  by  communications  or  timer  inputs. 
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Parallel  operation  is  based  on  the  following  assumptions  [Ref  4:p.  24]: 

•  The  quickest  context  switch  is  made  by  saving  the  least  amount  of  data  for  any 
given  process, 

•  A  process  must  perform  I/O,  or 

•  A  process  not  performing  I/O  is  in  a  loop  and  must  eventually  execute  a  loop  end 
instruction  or  jump  instruction,  and 

•  A  high  priority  process  needs  to  execute  as  soon  as  it  is  ready,  and  executes  until 
completion  or  blocked  for  communication. 

A  concurrent  process  may  be  active  at  any  time,  in  which  case  it  is  either 
executing  or  on  a  list  waiting  to  t>e  executed.  Inactive  processes  are  ready  to  input,  ready 
to  output,  or  waiting  until  a  specified  time. 

Concurrency  administration  in  the  Transputer  is  implemented  in  hardware, 
unlike  most  multi-tasking  systems  in  which  control  takes  place  in  software.  The 
following  hardware  support  is  provided  for  implementation  of  parallel  operations 
[Ref9:p.  31]: 

•  Two  Timing  Registers, 

•  Four  Process  Queue  Registers,  and 

•  Special  registers  for  saving  some  process  context  switch  data. 
2.     Process  Priority  and  Interrupts 

Each  concurrent  process  is  represented  by  a  vector  of  words  in  memory  called 
the  process  workspace  as  shown  in  Figure  2.7.  A  process  is  in  one  of  three  states: 
executing,  ready,  or  blocked  and  the  Workspace  Pointer  Register  points  to  the  executing 
process.  This  space  is  used  to  hold  the  local  variables  and  temporary  values  manipulated 
by  a  process.  The  workspace  is  organized  as  a  falling  stack  with  end-of-stack  addressing. 
All  local  variables  are  addressed  as  positive  offsets  from  the  Workspace  Pointer. 
[Ref  9:pp.  28-31] 
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Memory 


bcal  variable  2 


local  variable  1 


local  variable  0 


Workspace: 

used  to  hold  local  variables  and 
temporary  values  manipulated 
by  the  process. 


Workspace  Pointer 

Instruction  Pointer 
(fordescheduled  processes) 


J-  Linkage  Information 

(for  scheduling  communication 
and  timer  inputs) 


Figure  2.7.  Process  Workspace 

3.     Process  Scheduling 

The  scheduler  operates  to  avoid  having  inactive  processes  consume  any 
processor  time.  Active  processes  waiting  to  be  executed  are  held  on  a  linked  list  of 
workspaces  implemented  using  two  registers:  one  pointing  to  the  first  process  on  the  list 
and  the  other  to  the  last.  Figure  2.8  shows  process  S  in  execution,  processes  R,  Q  and  P 
awaiting  execution. 

A  high  priority  process  is  executed  (chosen  in  order  of  receipt)  until  it  is  unable 
to  proceed  because  it  is  awaiting  an  input  or  output,  or  waiting  for  the  timer.  Whenever  a 
process  is  unable  to  proceed,  its  instruction  pointer  is  saved  in  its  workspace  and  the  next 
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process  is  taken  from  the  list.  Actual  process  switching  times  are  very  small  as  little 
process  state  information  needs  to  be  saved;  even  the  evaluation  stack  is  not  saved  in 
process  rescheduling  [Ref  9:p.  31]. 


Registers 

r^ 

Locals 

-/ 

/^ 

Prograrri 

Front 

^^^ 

P 

^^ 

/-> 

Back 

Q 

r^ 

A 

R 

B 

^ 

^ 

C 

S 

Workspace 

"^ 

Next  Instruction 

Operand 

Figure  2.8.  Linked  Process  List 

When  a  low  priority  task  is  executing  and  a  high  priority  task  is  ready,  it 
preempts  the  low  priority  task.  Generally,  this  takes  place  at  the  end  of  the  current 
instruction.  Some  instructions,  like  block  moves  and  I/O,  are  interruptible.  In  the  event 
of  such  an  interrupt  the  state  of  the  low  priority  process  is  saved  in  special  system 
memory  locations  at  the  low  end  of  on-chip  memory  and  the  workspace  pointer  is  placed 
at  the  head  of  the  low  priority  queue.  Note:  high  priority  tasks  are  not  pre-empted. 
[Ref  4:pp.  25-28] 

The  process  context  switch  time  is  low  since  only  six  registers  need  be  saved 
and  stored  in  fast,  on-chip  memor>'.  There  is  full  machine  instruction  level  support  for 
context  switching  which  yields  very  low  overheads.  Occam  context  switching  overhead, 
about  one  microsecond,  is  comparable  to  conventional  control  structures  (e.g.,  loops, 
procedure  calls).  [Ref.  14:p.  139] 
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4.     Time  Slice  Periods 

Low  priority  processes  share  the  processor  in  time  sliced  intervals.  A  time  slice 
period  is  defined  as  5120  cycles  of  the  external  five  megahertz  clock,  or  about  one 
millisecond.  When  a  running  low  priority  process  has  consumed  its  allotted  time  slice, 
the  scheduler  attempts  to  deschedule  the  executing  process  to  make  the  processor 
available  for  the  next  processes  on  the  queue.  [Ref  9:pp.  50-51] 

Processes  are  descheduled  at  the  end  of  their  time  slice,  or  shortly  thereafter  (to 
allow  for  completion  of  the  currently  executing  instruction),  and  added  to  the  end  of  the 
low  priority  queue.  In  general,  the  minimum  period  of  time  for  time-slicing  low  priority 
processes  is  one  millisecond  with  the  expected  maximum  period  being  two  milliseconds. 
Given  n  low  priority  processes  and  no  high  priority  processes,  the  maximum  time  a 
process  will  wait  in  the  process  queue  for  CPU  time  is  2n-2  time  slice  periods. 
[Ref9:p.  51] 

D.  PROGRAMMING  LANGUAGE  -  OCCAM 

Transputers  can  be  programmed  in  most  high  level  languages,  and  are  designed  to 
ensure  that  compiled  programs  will  be  efficient.  Where  it  is  required  to  exploit 
concurrency  and  still  to  use  standard  languages,  Occam  can  be  used  as  a  harness  to  link 
modules  written  in  selected  languages.  Occam  and  Transputers  were  designed  together 
based  on  Hoare's  CSP.  Transputers  include  special  machine  instructions  and  hardware  to 
provide  maximum  performance  and  optimum  implementations  of  the  Occam  model  of 
concurrency  and  communications.  Therefore,  to  gain  the  most  benefit  from  the  Transput- 
er architecture,  the  whole  system  can  be  programmed  in  Occam.  This  provides  all  the 
advantages  of  a  high  level  language,  maximum  program  efficiency  and  the  ability  to  use 
the  special  features  of  the  Transputer.  [Ref  9:p.  3] 
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Many  programming  languages  depend  on  the  existence  of  the  uniformly  accessible 

memory  found  in  conventional  single  processor  computers.  As  discussed  in  Chapter  I, 

such  an  architecture  would  likely  require  the  existence  of  a  shared  bus  which  itself 

becomes  a  bottleneck  as  the  number  of  processors  in  a  multiprocessor  system  grows.  The 

aim  of  Occam  is  to  remove  this  difficulty  by  expressing  arbitrarily  large  systems  in  terms 

of  localized  processing  and  communication.  INMOS  describes  Occam  as  follows: 

The  main  design  objective  of  Occam  was  therefore  to  provide  a  language  which 
could  be  directly  implemented  by  a  network  of  processing  elements,  and  could 
directly  express  concurrent  algorithms.  In  many  respects,  Occam  is  intended  as  an 
assembly  language  for  such  systems;  there  is  a  one-to-one  relationship  between 
Occam  processes  and  processing  elements,  and  between  Occam  channels  and  links 
between  processing  elements.  [Ref  2:p.  5] 

Occam  can  further  be  considered  an  assembly  language  in  that  memory  is  viewed  as 
a  linear  object  and  all  resources  are  assigned  at  compile  time  thereby  precluding 
recursion  or  dynamic  resource  allocation.  The  general  simplicity  of  the  language  is 
recognized  by  being  named  after  William  of  Occam,  a  14th  century  philosopher  who  is 
responsible  for  the  adage  (known  as  Occam's  razor):  "Entities  are  not  to  be  multiplied 
beyond  necessity"  [Ref.  15:p.  1].  Hoare  [Ref.  16:pp.  75-83]  echoes  this  idea  by  stating, 
"...There  are  two  ways  of  constructing  a  software  design.  One  way  is  to  make  it  so 
simple  that  there  are  obviously  no  deficiencies  and  the  other  way  is  to  make  it  so 
complicated  that  there  are  no  obvious  deficiencies.  The  first  is  far  more  difficult..." 

Another  attractive  feature  of  Occam  is  that  its  semantics  can  be  formally  stated. 
Welch  [Ref  14:p.  146]  describes  Occam  as  a  formal  logic  for  the  expression,  transforma- 
tion and  qualitative  analysis  of  parallel,  sequential  and  nondeterministic  systems.  By 
analysis  of  a  language  which  adheres  to  formal  rules  it  becomes  possible  to  analyze  a 
program  in  order  to  prove  that  it  is,  for  example,  deadlock  free.  Formal  techniques  can 
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also  be  used  to  transform  a  program  into  different  but  equivalent  forms  for  either 
efficiency  or  readability,  or  to  adapt  parallel  code  to  run  on  a  single  processor  and  vice 
versa  [Ref  15:p.  132]. 

1.  Processes 

Occam  enables  an  application  to  be  described  as  a  collection  of  processes  which 
operate  concurrently  and  communicate  through  channels.  In  such  a  description,  each 
Occam  process  describes  the  behaviour  of  one  component  of  the  implementation,  and 
each  channel  describes  a  connection  between  components.  The  design  of  Occam  allows 
the  components  and  their  connections  to  be  implemented  in  many  different  ways.  This 
additionally  allows  the  choice  of  implementation  technique  to  suit  available  technology, 
to  optimize  performance,  or  to  minimize  cost  [Ref  2:p.  5]. 

Each  process  can  be  regarded  as  an  entity  with  internal  state,  which  can 
communicate  with  other  processes  using  direct  communication  channels.  Processes  can 
be  used  to  duplicate  the  behaviour  of  specific  entities  such  as  a  logic  gate  or  a 
microprocessor.  Processes  are  finite:  each  starts,  performs  a  set  of  instructions  then 
terminates.  Processes  may  be  nested  within  each  other  much  as  subprocedures  are 
contained  within  procedures  in  other  languages. 

2.  Constructs 

a.     Sequential:  SEQ 

A  fundamental  building  block  of  sequential  code  is  the  SEQ  statement. 
Occam  code  is  written  sensitive  to  horizontal  indentation  of  the  each  line.  Blocks  are 
defined,  therefore,  by  their  depth  of  indentation.  A  block  of  sequential  code  is  headed  by 
the  keyword  SEQ  and  followed  by  the  code  indented  two  spaces  as  in  the  example 
below.  The  ":="  is  the  assignment  symbol  which  is  one  of  Occam's  three  primitives. 
The  other  primitives,  "!"  and  "?",  are  discussed  below. 
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SEQ 

X  :=  a  +  b 

y  :=  c  *   d 

z  :=  X  /  y 

b.     Communication 

A  connection  formed  between  two  concurrent  processes  is  known  as  a 
channel  and  implemented  either  internally  in  the  case  of  the  two  processes  existing  on  the 
same  processor,  or  via  hardware  links  in  the  case  of  multiple  processors.  Communica- 
tions are  synchronized  and  unbuffered.  If  a  channel  is  used  for  input  in  one  process  and 
output  in  another,  communication  takes  place  when  both  processes  are  ready.  Since  a 
process  may  have  concurrency,  it  may  have  many  input  and  output  channels  performing 
communication  at  the  same  time. 

Channel  input  "?"  and  channel  output  "!"  along  with  the  assignment  ":=" 
form  Occam's  three  primitives.  An  example  in  Figure  2.9  below  shows  use  of 
communication  constructs  in  a  simple  buffer.  The  process  as  pictured  communicates  via 
two  channels  declared  "in"  and  "out".  A  series  of  variables  assigned  to  "item"  are 
sequentially  taken  in  and  sent  out  of  the  process. 


WHILE    TRUE 

SEQ                                        in    ^ 

item 

out 

in    ?    item 

out    !    item 

Figure  2.9.  Single  Element  Buffer 

c.     Parallel:  PAR 

The  parallel  construct  keyword  is  PAR  and  has  placement  rules  similar  to 
SEQ.  Unlike  SEQ,  however,  each  line  below  PAR  in  the  next  indentation  will  be 
logically  executed  in  parallel  and  considered  concurrent  processes.  In  the  example  here 
the  three  statements  are  executed  simultaneously.  [Ref.  17:pp.  15-17] 
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PAR 

in  ?  itema 

out  !  itemb 

X  :—  itema  +  itemb 

The  parallel  construct  is  unique  to  Occam  and  provides  a  straightforward 
way  of  writing  programs  which  directly  reflects  the  concurrency  inherent  in  real  systems. 
A  more  involved  example  using  a  two  stage  buffer  is  shown  in  Figure  2.10  below.  Note 
the  indentation  of  the  SEQ  blocks  indicating  the  two  executing  parallel  processes. 


PAR 

WHIT.F.  TRUE 
SEQ 

in  ?  itema 
ch  !  itema 
SEQ 

ch  ?  itemb 
out  !  itemb 

inj 

itema 

ch,, 

itemb 

outl 

Figure  2.10.  Two  Element  Buffer 

A  priority  level  can  be  used  with  the  PAR  construct  to  form  high  and  low 

priority  processes.  The  PRI  PAR  construct  designates  the  first  process  in  the  following 

indentation  level  as  high  priority,  and  the  rest  as  low  priority.  If,  as  in  the  example 

below,  there  are  five  processes  executed  in  parallel  and  it  is  desired  that  two  be  assigned 

high  priority,  a  combination  of  PRI  PAR  and  PAR  can  be  used: 

PRI    PAR 

PAR       —  high  priority 

processl 

process2 
process3   —  low  priority 
process4 
processS 
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Since  the  PAR  construct  is  more  difficult  to  implement  and  more  demand- 
ing of  CPU  resources  it  should  not  be  used  in  situations  where  SEQ  would  suffice. 
Bums  [Ref  15:p.  29]  states  that  PAR  should  be  used  instead  of  SEQ  when: 

•  It  more  accurately  reflects  the  properties  of  the  program, 

•  It  allows  the  subprocesses  to  be  arranged  and  thus  enables  program  transforma- 
tions to  be  applied, 

•  It  allows  channel  operations  to  be  executed  in  an  order  determined  by  the 
dynamics  of  the  program's  execution  rather  than  by  a  predefined  and  meaningless 
sequence. 

d.     Alternation 

The  alternative  construct,  ALT,  waits  until  one  of  the  listed  conditions 

becomes  true,  then  performs  that  guard  and  any  subsequent  code  grouped  with  it.  The 

ALT  construct  provides  a  formal  high  level  language  method  of  handling  external  and 

internal  events  that  are  usually  handled  by  assembly  level  programming  in  conventional 

microprocessors.  A  guard  under  an  ALT  is  usually  a  channel  input  but  can  be  other 

boolean  conditions  or  combinations.  In  the  first  example  below,  two  channels  are 

awaiting  input  and  each  has  a  process  associated  with  it.  The  process  associated  with  the 

first  channel  that  becomes  ready  will  be  executed.  The  next  example  shows  a  boolean 

condition     and    combination    channel    input    and    boolean     as     as    ALT     guards. 

[Ref  17:pp.  18-19] 

ALT  ALT 

cl  ?  X  (X  <  y  +  5)  &  SKIP 

processl  process3 

c2  ?  y  c3  ?  z  &  continue 

process2  process4 

Alternation  is  entirely  nondeterministic  in  that  the  choice  taken  is  not 
predictable  since  it  depends  upon  actions  performed  by  other  independent  concurrent 
processes.  Furthermore,  if  more  than  one  choice  in  an  alternation  becomes  ready 
simultaneously,  the  guard  selected  is  undefined  in  Occam.  In  practice,  however,  the  last 
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defined  of  the  guards  becoming  ready  simultaneously  will  be  selected.  This  is  stricdy  a 

matter  of  the  compiler's  implementation  of  the  language  and  not  of  the  language 

definition.  In  order  to  force  selection  of  a  particular  guard  when  more  than  one  become 

ready  simultaneously,  the  PRI  ALT  can  be  used  to  give  preferential  treatment  to  the  first 

guard  in  the  list.  [Ref  17:p.  72] 

€.     Loop  and  Selection 

Loops  and  conditional  selections  in  Occam  are  very  similar  to  other  high 

level  languages.  Conditions  can  be  any  legal  expression  which  evaluates  to  a  boolean 

value.  The  constructs  follow  the  same  rules  of  blocking  and  indentation  as  mentioned 

above  with  the  SEQ  keyword  used  to  head  a  block.  The  examples  below  show  a  finite 

and  infinite  WHILE  loops. 

WHILE  (X  -  5)  >  0              WHILE  TRUE 

SEQ  ALT 

x:=x-2  cl?x 

i  :=  i  +  1  c2  !  x 

Selection  is  accomplished  with  IF  and  CASE  constructs  typical  in  many 

languages.  The  code  associated  with  the  first  expression  which  evaluates  to  true  will  be 

executed  in  an  IF  with  multiple  choices.  To  ensure  completion  of  the  IF  construct  in 

situations  where  it  is  possible  none  of  the  conditions  are  met  the  last  condition  should 

always  be  TRUE  and  its  associated  code  may  be  a  SKIP.  For  clarity,  a  boolean  constant 

"otherwise"  can  be  declared  and  set  to  TRUE  to  replace  the  keyword  as  the  last  choice  in 

the  IF  construct: 

IF 

(x  -  5)  >  0     —  first  condition  to  be  tested 

x  :=  X  *  2 
(x  =  y)         —  next  condition  to  be  tested 

y  :=  X  -  5 
otherwise       —  otherwise  assigned  TRUE 

SKIP 
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/.      Replication 

A  replicator  is  used  with  SEQ,  PAR,  ALT  or  IF  constructions  to  replicate 

the  component  process  a  number  of  times.  This  ability  allows  for  creation  of  multiple 

code  segments,  each  with  potentially  different  parameters  based  on  the  incremented  index 

of  the  process.  For  example,  a  replicator  can  be  used  with  SEQ  to  act  as  a  conventional 

loop  or  with  a  PAR  to  construct  an  array  of  concurrent  processes  PI,  P2,  ...  Pn 

[Refl7:pp.  20-21]: 

SEQ   i   =   0   FOR  n  PAR   i   =   0   FOR  n 

X    :=  X   *    2  Pi 

3.     Configuration 

Assignment  of  processes  to  specific  processors  is  called  configuration.  In  a 
sense  the  configuration  section  of  an  Occam  program  can  be  considered  as  the  "main" 
block  in  other  languages  in  which  the  working  code  primarily  calls  previously  defined 
procedures  and  passes  variables.  Link  assignments  and  global  variable  passing  is 
performed  during  configuration.  The  keyword  PLACE  ...  AT  is  used  to  assign  processor 
links  to  processes. 

Since  each  processor  executes  its  code  concurrendy  with  other  processors,  a 
variation  of  the  PAR  construct,  PLACED  PAR  is  used.  In  the  example  below  a 
network  of  processors  will  be  assigned  the  process  "process. a"  using  a  replicated 
PLACED  PAR.  Note  that  each  process. a  is  also  passed  as  a  parameter  the  value  of  its 
respective  "index"  which  may  be  used  in  specializing  the  actions  of  each  replicated 
process. 

PLACED  PAR  index  =  0  FOR  number , of .processors 
PROCESSOR  nxunber.  of  .processors 
PLACE  input . channel  AT  linkO.in 
PLACE  output . channel  AT  link2.out 

process. a  (input . channel,  output .channel,  index) 
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4.     Program  Development 

Considering  that  parallel  programming  involves  multiple  processors,  in  most 

cases  there  will  be  many  ways  to  perform  the  same  task.  Finding  the  optimum 

arrangement  of  processes  and  priorities  will  involve  experimentation.  Another  design 

intent  of  Occam  is  that  the  logical  behavior  of  a  program  is  independent  of  how  the 

processes  are  assigned  to  processors.  According  to  INMOS: 

It  is  guaranteed  that  the  logical  behaviour  of  a  program  is  not  altered  by  the 
way  in  which  the  processes  are  mapped  onto  processors,  or  by  the  speed  of 
processing  and  communication.  Consequently  a  program  ultimately  intended  for  a 
network  of  Transputers  may  be  compiled,  executed  and  tested  on  a  single  computer 
used  for  program  development.  Even  if  the  application  uses  only  a  single  Transput- 
er, the  program  can  be  designed  as  a  set  of  concurrent  processes  which  could  run  on 
a  number  of  Transputers.  [Ref9:p.  18] 

This  guarantee  is  useful  in  using  a  single  Transputer  to  create  complex  programs 

destined  for  multiple  processors.  Figure  2.11  shows  the  equivalence  of  single  and 

multiprocessor  code,  and  the  relationships  between  internal  and  external  channels. 


\^   Process 
I I   Processor 


Figure  2.11.  Single  And  Multiprocessor  Equivalence 

5.     Programming  Practices 

Significant  influence  of  program  performance  can  be  achieved  by  use  of  priority 
in  parallel  constructs.  Correct  use  of  prioritization  is  especially  important  for  most 
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distributed  processes  communicating  via  links  and  dependent  upon  prompt  data  flow  for 
timely  program  execution.  If  a  message  is  transmitted  to  a  Transputer  and  requires 
throughrouting,  as  in  a  hypercube  topology,  it  is  essential  that  the  Transputer  input  the 
message  then  output  it  with  minimum  delay  so  that  another  Transputer  somewhere  in  the 
system  would  not  be  unnecessarily  delayed  waiting  for  the  message.  [Ref  10:p.  13]. 

Therefore,  all  processes  which  use  links  (external  channels  vice  internal 
channels)  should  be  run  at  high  priority.  Processes  which  perform  calculations  only  or  a 
minimum  of  communications  should  be  run  at  low  priority  to  preclude  degrading  the 
efficiency  of  those  processes  performing  external  communication  and  subsequently  the 
efficiency  of  the  network  as  a  whole. 
6.     Programming  Environment 

The  Occam  compiler  is  packaged  by  INMOS  within  the  Transputer  Develop- 
ment System  (TDS),  which  provides  compilers,  libraries,  a  debugger,  a  unique  text  editor 
and  other  facilities. 

Code  is  entered  within  a  "fold"  editor:  a  text  editor  in  which  any  number  of  lines 
can  be  "folded"  or  replaced  with  a  user  defined  header  line.  By  using  folds  to  condense 
blocks  of  code,  the  entire  program  may  always  be  viewed  on  just  one  screen  without 
having  to  page-up  or  page-down,  or  requiring  a  multiple  window  environment  to  view 
different  blocks  of  code  simultaneously.  A  fold  header  line  is  identified  by  three  leading 
periods  followed  by  the  fold  title,  for  example:  "...  fold  title".  Folds  may  be  nested  to 
any  level,  and  if  selected  wisely,  can  show  the  overall  structure  of  a  program  with  the 
details  hidden  within  the  folds.  To  see  the  details  the  fold  is  entered  and  only  the 
contents  of  that  fold  are  seen.  Optionally,  the  text  surrounding  the  fold  may  remain 
visible,  but  will  usually  be  off  screen  depending  upon  the  size  of  the  fold  contents. 
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The  fold  editor  and  compiler  work  together  as  there  are  several  specialized  types 
of  folds.  A  PROGRAM  fold  contains  code  intended  to  be  booted  and  run  on  a  network 
of  Transputers  external  to  the  development  board  (described  in  Chapter  III.)  located  in  a 
desktop  personal  computer  (I*C).  Executable  (EXE)  folds  contain  code  for  the  develop- 
ment board  in  the  PC.  Both  PROGRAM  and  EXE  folds  may  contain  separately 
compiled,  or  SC  folds  to  break  up  the  code  into  stand  alone  units  which  may  be  compiled 
and  modified  separately  without  affecting  others  thus  reducing  recompile  time  for  minor 
changes.  Library  (LIB)  folds  are  also  available  either  predefmed  or  created  by  the  user  to 
contain  global  information  which  is  referenced  within  appropriate  PRCXjRAM,  SC,  or 
EXE  folds.  Finally,  any  fold  may  be  temporarily  removed  from  compiler  action  by 
nesting  it  within  a  COMMENT  fold.  [Ref.  18:pp.  13-22,40,84-86] 

Transputer  Development  System  version  D700D  was  used  for  this  project. 
Version  D  included,  among  other  utilities,  a  debugger  for  error  tracing.  The  debugger 
was  useful  when  used  on  code  written  for  the  B004  development  board.  However,  as 
may  be  expected,  tracing  errors  in  an  external  network  of  Transputers  is  difficult. 
Complete  details  of  the  TDS  can  be  found  in  [Ref  18]. 
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III.    MESSAGE    EXCHANGE    HARDWARE 

The  following  sections  discuss  in  detail  the  equipment  used  in  the  implementation  of 
the  message  exchange.  A  complete  diagram  showing  major  component  relationships  is 
provided  at  the  end  of  this  chapter. 

A.    C004  LINK  CROSSBAR 
1.     General 

The  IMS  C004  is  a  programmable  32-way  crossbar  switch  that  supports  the 
INMOS  link  protocol  to  provide  synchronized  communication  between  similarly 
equipped  components.  A  crossbar  can  be  thought  of  as  a  matrix  of  switches  which 
permit,  in  the  case  of  the  0004,  any  one  of  the  32  inputs  to  be  directly  connected  to  any 
one  of  the  32  outputs.  Figure  3.1  shows  a  block  diagram  of  an  IMS  C004. 
[Ref9:pp.  367-368] 
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This  ability  to  manipulate  32  connections  serves  to  greatly  expand  the  flexibility 
of  the  Transputer  which  typically  contains  four  link  engines.  (Some  specialized  Trans- 
puters have  only  two  links,  and  new  models  are  planned  with  eight).  In  the  normal 
situation,  therefore,  a  Transputer  may  be  directly  connected  to  a  maximum  of  four  other 
Transputers  or  link-equipped  devices.  If  the  applications  requires  that  more  than  four 
Transputers  be  logically  connected,  intermediate  Transputers  must  be  programmed  to 
pass  data  along  to  the  desired  destination,  but  only  after  incurring  delays  and  additional 
overhead  as  compared  to  a  direct  route.  Use  of  a  program  controlled  crossbar  precludes 
this  problem  by  allowing  one  (or  more)  links  from  the  Transputer  to  be  directly 
connected  to  as  many  as  32  different  links.  Thirty-two  Transputers  can  be  completely 
configured  using  two  IMS  CCKMs  [Ref.  19:p.  13].  For  applications  requiring  even  greater 
flexibility,  arbitrarily  large  crossbars  can  be  constructed  by  cascading  multiple  CX)04s  as 
described  in  [Ref  19:p.  8]. 

2.     Hardware  Description 

The  C004  is  packaged  in  an  84  pin  grid  array  chip  and  internally  organized  as  a 
set  of  32-to-l  multiplexers.  Each  multiplexer  has  associated  with  it  a  six  bit  latch,  five 
bits  of  which  select  one  of  32  inputs  as  the  source  of  data  for  the  corresponding  output. 
The  sixth  bit  is  used  to  connect  or  disconnect  the  output.  When  this  last  bit  is  set  high  the 
link  is  considered  to  be  active.  These  latches  can  be  read  and  written  by  messages  sent 
on  the  configuration  link  via  ConfigLinkln  and  ConfigLinkOut.  Figure  3.2  shows  the 
C004  hardware  implementation.  [Ref  9:p.  377] 

A  reset  input  is  also  provided  to  clear  all  latches  so  that  no  connections  are 
active.  CapPlus  and  CapMinus  are  connections  for  an  external  capacitor,  and  LinkSpeed 
sets  the  transfer  rate  through  the  crossbar  at  either  ten  or  20  Megabits  per  second. 
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Clockln  provides  input  from  a  five  MHz  clock  used  by  the  C004  for  internal  component 
timing.  Recall,  however,  that  input  communication  synchronization  is  not  dependent 
upon  Clockln  or  its  phase.  [Ref  9:pp.  368-372] 
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Although  C004  crossbars  can  be  cascaded  in  multiple  connections  to  any  depth 
without  loss  of  signal,  each  crossbar  does  introduce  an  average  1.75  bit  signal  delay 
which  arises  when  the  output  of  each  multiplexer  is  synchronized  with  an  internal  high 
speed  clock  and  regenerated  at  the  output  buffer  [Ref  19:p.  1].  This  delay  becomes 
significant  when  more  than  two  C004s  are  connected  in  series  due  to  the  link  protocol 
described  in  Chapter  II,  In  the  TSOO  Transputer,  maximum  transmission  rates  are 
accomplished  by  transmitting  the  acknowledge  packet  to  the  sender  as  soon  as  the  first 
few  bits  of  the  11  bit  data  packet  is  received.  Therefore,  the  sender  will  be  able  to  send 
the  next  data  packet  without  delay  since  it  would  have  already  received  the  acknowledge 
for  the  current  data  packet.  Communications  through  a  series  of  C(X)4s  will  cause  a 
signal  delay  of  approximately  four  to  five  bits,  which  in  turn  causes  a  gap  to  occur 
between  consecutive  data  packets  since  the  sender  must  wait  for  the  arrival  of  the 
acknowledge.  A  single  C004  inserted  between  two  linked  Transputers  which  fully 
implement  overlapped  acknowledges  causes  no  reduction  in  bandwidth  [Ref  19:p.  5]. 
3.     Software  Commands 

Commands  and  responses  are  transmitted  to  the  C004  via  the  ConfigLinkOut 
and  ConfigLinkI n  lines  from  a  controlling  Transputer.  All  communications  are  in  sets  of 
one,  two  or  three  bytes.  Byte  values  zero  through  31  are  used  to  reference  a  crossbar  link 
number.  Bytes  values  zero  through  six  are  used  to  specify  one  of  seven  commands  listed 
in  Table  3.1  [Ref  19:p.  3].  If  a  message  of  an  unspecified  configuration  is  sent  to  the 
C004  the  effect  is  undefined. 
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TABLE  3.1 
C004  CROSSBAR  CONTROL  COMMANDS 


CONFIGURATION 
COMMAND  [Bytes] 

COMMAND  FUNCTION 

[0]    [input]    [output] 

Connects  specified  input  to  output  (unidirectional). 

[1]    [linkl]    [link2] 

Connects  linkl  to  link!  by  connecting  the  input  of  link!  to  the 
output  of  link2  and  the  input  of  Unk2  to  the  output  of  linkl. 

[2]   [output] 

Enquires  which  input  the  output  is  connected  to.  The  C004 
responds  with  the  input  link  byte  designation.  The  most 
significant  bit  of  this  byte  indicates  whether  or  not  the  link  is 
actually  connected  (bit  set  high)  or  disconnected  (bit  set  low). 

[3] 

Essentially  a  pause  command  which  should  be  sent  immedi- 
ately after  a  connection  is  made  to  ensure  the  links  are  ready 
to  accept  data.  Necessary  only  if  the  connection  will  be  used 
immediately  upon  established. 

[4] 

Resets  all  Unks  on  the  C004  to  disconnected  status. 

[5]    [output] 

Specified  output  is  disconnected  and  held  low. 

[6]    [linkl]    [Imk2] 

Disconnects  the  output  of  linkl  and  the  output  oi  linkl. 

B.    B012  TRANSPUTER  MODULE  MOTHERBOARD 

The  IMS  BO  12  is  a  module  motherboard  of  standard  eurocard  design  (220  mm  by 
233.5  mm)  which  contains  a  collection  of  fixed  and  removable  components.  Two  C004 
crossbars  (designated  IC2  and  IC3,  discussed  in  the  following  section)  and  a  T212 
Transputer  (ICl)  are  installed  on  the  motherboard,  and  16-pin  slots  are  provided  for  a 
maximum  of  16  removable  Transputer  Modules  or  TRAMS.  TRAMS  are  available  in  a 
variety  of  configurations  and  sizes,  some  requiring  more  than  one  slot  thus  reducing  the 
number  of  TRAMS  allowed  on  the  board.  According  to  Watson  [Ref  20:p.  2],  the 
design  goals  of  the  module  motherboards  with  crossbars  are: 

•  To  be  able  to  build  systems  with  any  number  of  Transputer  modules  in  any 
combination  or  type  or  size; 

•  To  be  able  to  build  a  variety  of  different  kinds  of  networks  (e.g.,  arrays,  trees, 

cubes,  etc.); 
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Enable  any  number  of  motherboards  to  be  chained  together; 

Make  Transputer  link  connections  configurable  by  software; 

To  be  able  to  run  test  and  applications  programs  on  Transputers  without  first 
configuring  links; 

Provide  a  standard  hardware  interface  to  configuration  and  applications  software; 

Make  the  Transputer  hardware  independent  of  the  host  system 
Two  96-pin  edge  connectors,  PI  and  P2,  provide  connections  between  the  BO  12 
components  and  other  Transputers,  as  well  as  power  supply  to  the  B012.  The  TRAM  slot 
and  link  edge  connector  arrangements  are  shown  in  Figure  3~3. 
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Figure  3.3.  IMS  B012  Slot  And  Edge  Connector  Arrangement 

1.     Motherboard  Configuration 

Connecting  pins  for  each  slot  provides  power,  reset  and  clock  signals  to  the 
TRAM  and  connects  the  links  of  the  TRAM  mounted  Transputer  to  the  motherboard.  In 
general,  TRAM  linkl's  and  link2's  are  connected  head  to  tail  in  a  bidirectional  pipe,  and 
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linkO's  and  linkS's  are  hardwired  to  the  C004's.  When  not  all  slots  are  populated  with 
TRAMs,  or  when  TRAMs  larger  than  one  slot  are  installed  jumpers  must  be  placed  in  the 
unused  connectors  to  maintain  connectivity  of  the  pipe.  [Ref.  21:p.  2] 

Some  flexibility  is  provided  by  jumper  block  Kl  which  can  break  or  rearrange 
the  pipe  into  other  configurations.  For  example,  a  four-by-four  node  matrix  can  be 
created  by  breaking  the  16  slot  hard- wired  pipe  into  four  segments  by  opening  the 
appropriate  Kl  jumpers,  then  completing  the  mesh  using  crossbar  connections.  Instead 
of  leaving  the  Kl  jumpers  open  as  in  the  mesh,  a  four-by-four  node  torus  can  be  created 
by  looping  each  four-Transputer  segment  into  a  ring,  and  making  additional  crossbar 
connections  to  loop  the  perpendicular  dimensions.  Variations  in  pipe  configuration  and 
C004  control  can  also  be  made  via  jumpers  on  the  P2  edge  connector.  Figure  3.4  shows 
the  16  slots  in  a  pipe  with  the  variable  connections  via  Kl  and  P2  identified.  A  specific 
pin-out  of  the  Kl  jumper  block  can  be  found  in  [Ref  21  :p.  36]. 
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Linkl  on  slotO  is  wired  to  P2  and  is  called  PipeHead,  and  link2  on  slot  15  is  also 
wired  to  P2  and  called  PipeTaii  By  properly  installing  the  jumpers  in  unused  slots 
PipeTail  will  then  connect  to  whichever  slot  is  fiUed  last  on  the  BO  12.  By  connecting  the 
pipe  heads  and  tails  from  multiple  B012s  together,  a  large  pipeline  of  TRAMs  can  be 
created. 

In  a  standard  configuration,  slotO  is  the  first  TRAM  in  the  pipe  and  its  linkO  and 
links  are  connected  to  the  C004s.  Multiple  B012s  may  be  connected  in  another  pipe 
involving  the  T212s.  However,  slotO  can  be  accessed  directiy  via  P2  and  Kl  instead  of 
the  C004s  for  specialized  applications  in  which  the  TRAM  in  slotO  is  used  to  control  the 
T212  or  the  remaining  TRAMs  on  the  B012.  Thus,  multiple  B012s  can  be  chained 
together  connected  via  the  Transputer  in  slotO  instead  of  the  T212s.  Normally,  each  linkO 
and  links  of  all  slots  are  connected  to  the  two  C004s  per  the  arrangement  shown  in 
Figure3.5.  [Ref21:pp.  5-6] 

As  shown,  the  link  output  signals  from  all  the  linkO's  on  all  slots  are  connected 
to  the  16  inputs  of  one  C004.  The  link  input  signals  from  all  the  linkS's  on  all  slots  are 
connected  to  16  outputs  of  the  same  C004.  The  remaining  16  inputs  and  16  outputs  of 
that  C004  are  connected  to  the  edge  connector  PI.  The  other  C(X)4  is  connected 
similarly,  except  that  16  of  its  inputs  are  connected  to  the  outputs  of  all  linkS's  on  all 
slots,  and  16  of  its  outputs  are  connected  to  the  inputs  of  all  linkO's  on  all  slots.  The 
remaining  inputs  and  outputs  are  connected  to  PI.  [Ref  21:pp.  5-8] 
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Figure  3.5.  Link  Organization  Between  B012  Slots  and  C004  Crossbars 

The  BO  12  link  switching  organization  using  C004s  does  not  allow  complete 
freedom  to  connect  any  link  to  any  other.  [Ref  21:p.  10]  The  result  of  this  connection 
scheme  does  allow  any  linkO  on  any  module  may  be  routed  via  the  C004s  to  any  link3  on 
any  module  in  just  one  programmed  connection  on  each  C004.  A  link  between  any  linkO 
to  another  linkO,  or  between  a  linkB  to  another  link3  (henceforth  defined  as  a  "like-link") 
may  also  be  established,  but  only  with  the  use  of  two  programmed  connections  on  each 
C004  and  two  appropriately  selected  link  cables  on  edge  connector  PI. 

Edge  connector  PI  has  three  columns  of  32  pins  each:  the  pins  in  one  column 
are  all  ground,  another  column  are  all  link  inputs,  and  the  third  are  all  link  outputs.  These 
32  connections  represent  half  of  all  available  connections  on  the  two  crossbars,  with  the 
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remainder  hardwired  to  the  slots.  Therefore,  each  row  of  three  pins  provides  a 
bidirectional  link,  and  the  32  rows  can  be  connected  as  the  user  desires  to  fulfil  the  needs 
of  the  application  at  hand.  The  32  rows  are  numbered  from  0  (at  the  top)  to  31.  Figure 
3.6  shows  the  connections  routed  to  PI  and  P2.  [Ref  21:pp.  9-10,26] 
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Figure  3.6.  B012  Edge  Connectors  PI  And  P2 
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A  module  motherboard  has  Up,  Down,  and  Subsystem  ports  on  the  P2  connector 
which  provide  hierarchical  control.  A  board  is  able  to  control  a  subsystem  of  other 
boards  by  connecting  its  Subsystem  port  to  the  Up  port  of  the  next  board.  By  connecting 
the  Down  port  of  the  first  subsystem  board  to  the  Up  port  of  the  next  a  daisy  chain  of 
boards  is  established.  At  any  point  down  the  chain,  a  new  subsystem  may  branch  by 
starting  with  a  Subsystem  port.  This  hierarchy  is  shown  in  Figure  3.7.  B012  architecture 
also  provides  flexibility  by  allowing  different  sources  of  reset  initiation  for  different 
components.  For  example,  if  there  are  n  slots  on  a  motherboard,  modules  in  slots  one 
through  n  may  be  controlled  from  either  the  Up  port  of  the  board  or  may  be  part  of  a 
subsystem  controlled  by  the  module  in  slotO.  [Ref  20:p.  9] 
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2.     T212  Transputer  Crossbar  Controller 

A  T212  Transputer  is  used  to  control  the  two  C004  crossbars  on  the  B012 
motherboard.  The  T212  has  four  links  of  which  linkO  and  linkS  are  connected  to  the 
C004s.  Linkl  and  link2  are  routed  to  the  P2  edge  connector,  labelled  ConfigUp  and 
ConfigDown  for  access  to  external  systems  or  routing  back  to  other  available  connections 
on  the  BO  12  such  as  Pipe  Head  and  PipeTail.  Configuration  data  for  the  C004  is  fed  into 
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one  of  the  T212's  links  (ConfigUp)  from  an  external  Transputer  or  firom  one  of  the 
TRAMs  on  the  BO  12.  Figure  3.8  shows  the  link  connections  from  the  T212 
[Ref21:pp.  10-12] 
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Figure  3.8.  T212  C004  Controller  Link  Organization 

The  T212  Transputer  follows  the  description  in  Chapter  II  above  on  the  T800 
with  some  key  exceptions.  The  T212  is  a  16  bit  processor  with  only  two  kilobytes  of 
on-chip  memory.  Although  the  T212  can  address  additional  memory  externally,  there  are 
no  provisions  on  the  BO  12  for  additional  memory.  (Another  Transputer  model,  the  T222, 
is  pin  compatible  with  the  T212  and  holds  four  kilobytes  of  on-chip  memory).  Due  to  the 
16  bit  architecture  a  maximum  of  62  kilobytes  of  external  memory  can  be  accessed 
[Ref  9:p.  318]  via  separate  16  bit  data  and  address  signal  paths. 

As  a  controller  for  the  C004s,  the  T212  is  intended  to  hold  a  small  program 
which  directs  the  connections  made  by  the  C004s.  This  program  can  run  indef)endently 
of  external  inputs,  or  can  receive  direction  from  other  Transputers  via  ConfigUp  and 
ConfigDown.  By  making  appropriate  jumpers  on  the  P2  edge  connector,  the  T212  can  be 
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included  in  the  pipe  formed  by  the  TRAMs  on  the  BO  12.  A  standard  configuration  for  a 
multiple  BO  12  board  system  is  to  form  a  daisy  chain  of  control  from  a  host  via  ConfigUp, 
and  connecting  subsequent  ConfigDown  to  ConfigUp  connections.  [Ref  21  :p.  12] 
3.    Transputer  Modules 

INMOS  Transputer  modules  (TRAMS)  are  designed  to  form  the  building  blocks 
of  parallel  processing  systems  in  which  more  than  the  on-chip  memory  is  required.  They 
consist  of  printed  circuit  boards  in  a  range  of  sizes  which  typically  hold  a  Transputer, 
some  memory,  and  perhaps  some  application  specific  circuitry.  A  TRAM  needs  only  a 
five  volt  power  supply,  ground  and  a  five  MHz  clock  to  operate.  These  are  supplied  to 
the  module  through  pins  connecting  the  TRAM  to  the  motherboard.  Other  pins  carry 
signals  for  the  Transputer's  links  and  reset,  analyze  and  error  signals.  [Ref  21:pp.  12-14] 

IMS  B401  is  the  designation  of  the  TRAMs  used  in  this  project.  Each  B401 
holds  a  T800  Transputer  and  32  kilobytes  of  static  random  access  memory  and  connects 
to  the  motherboard  using  one  slot  via  two  rows  of  eight  pins  each.  Sixteen  B401  TRAMs 
may  be  placed  on  one  BO  12  motherboard  for  a  total  of  16  Transputers  and  576  kilobytes 
of  memory.  Figure  3.9  shows  the  pin  arrangement  of  a  B401  TRAM.  Full  details  of  this 
particular  TRAM  can  be  found  in  INMOS  product  information  [Ref.  22]. 
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C.    B004  DEVELOPMENT  BOARD 
1.     General 

Transputer  components  form  a  unique  hardware  environment  which  is  not 
immediately  compatible  with  most  existing  personal  computers  (PC)  or  main  frames 
upon  which  development  work  is  accomplished.  The  IMS  B(X)4  development  board  was 
designed  to  meet  these  needs  by  interfacing  a  Transputer  memory  with  an  IBM  type 
personal  computer  allowing  the  software  developer  to  edit,  compile  and  test  software 
using  the  PC  as  a  host.  Figure  3.10  shows  a  block  diagram  of  the  B004  development 
board  which  fits  in  a  full  length  eight  bit  slot  of  an  IBM  compatible  PC.  [Ref.  23] 
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Figure  3.10.  IMS  B004  Transputer  Development  Board  Block  Diagram 

The  B004  contains  a  T414  Transputer,  two  megabytes  of  memory  and  circuitry 
to  connect  with  the  host  computer.  This  connection  is  made  via  one  of  the  Transputer's 
four  available  links  and  typically  through  linkO.  This  leaves  the  remaining  three  links  to 
connect  to  additional  Transputers  in  the  host  PC  or  in  other  locations.  When  the  B004  is 
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used  as  a  host  or  root  Transputer  for  a  larger  system,  compiled  code  is  booted  via  the 
B004  to  the  other  Transputers  along  link  connections.  The  TDS  may  prompt  the  user  for 
the  B004  link  number  being  used  to  boot  other  Transputers.  Once  the  first  link  is 
established,  the  TDS  relies  on  the  the  configuration  information  provided  in  the 
PROGRAM  fold  to  continue  routing  compiled  code  to  the  assigned  Transputers.  The 
order  of  external  Transputers  booted  follows  the  order  in  which  they  textually  appear  in 
the  program's  configuration  fold.  Each  Transputer  to  be  booted  is  first  reset  by  the 
host,  booted  with  its  assigned  compiled  code,  then  used  to  boot  the  next  Transputer  in  the 
system.  This  process  repeats  as  necessary,  following  configuration  fold  link  definitions, 
until  all  external  Transputers  are  booted. 

Backplane  connections  are  provided  for  link  hook-ups  and  subsystem  control 
{Up,  Down,  and  Subsystem)  as  discussed  for  the  B012.  Jumpers  are  also  attached  at  the 
backplane  to  connect  one  of  the  links  and  the  susbsystem  connection  to  the  host  PC.  In 
this  case,  the  host  PC  acts  as  the  master  controller  of  all  Transputers  in  the  network. 
Additional  Transputers  daisy  chained  to  the  B(X)4  are  controlled  hierarchically  via  the 
D6>u'/2  connection.  [Ref23:p.  10] 

B004  interface  to  the  host  PC  includes  access  to  the  PC's  resources  such  as 
monitor  and  mass  storage.  Depending  on  the  compiler  used,  coded  library  routines  may 
be  available  to  display  information  on  the  monitor  or  store  computed  results  on  the  hard 
disk.  Transputer  Development  System  D700  version  D  was  used  for  this  project  to  edit 
and  compile  the  Occam  program  code.  Extensive  use  of  library  calls  to  predefined 
display  subroutines  was  made  in  order  to  facilitate  monitoring  and  debugging  of  code. 
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2.     T414  Transputer 

The  B004  development  board  used  contained  a  single  T414  Transputer.  With 
few  exceptions,  most  of  the  information  contained  in  Chapter  n  above  applies  to  the 
T414  model.  Unlike  the  T800,  the  T414  does  not  include  an  on-chip  floating  point  unit 
and  only  holds  two  kilobytes  of  on-chip  memory.  [Ref  9:p.  179] 

T414  link  communication  protocols  are  the  same  as  TSOO,  however,  the  B004 
T414  does  not  send  the  ACK  packet  immediately  upon  receipt  of  the  first  few  bits  of 
incoming  data.  Instead,  the  B004  T414  waits  until  the  entire  data  packet  is  received 
before  the  ACK  is  sent.  Although  the  T414  does  support  the  high  speed  20  megabits  per 
second  data  transmission  rate,  other  components  on  the  B004,  particularly  the  PC 
interface  [Ref  23:p.  14],  restrict  the  T414  to  ten  megabits  per  second  link  speed.  Since 
the  T414  is  used  to  intercept  and  test  or  record  communications  from  the  T212  occurring 
along  the  link  controlling  pipe  of  the  message  exchange,  the  entire  system  data  flow  rate 
is  maintained  at  ten  megabits  per  second. 

D.    MESSAGE  EXCHANGE  HARDWARE  CONFIGURATION 

A  complete  view  of  the  message  exchange  hardware  is  shown  in  Figure  3.11.  This 
diagram  shows  the  relationships  between  the  key  components,  including  the  C004 
crossbars,  crossbar  controller,  TRAMs,  and  the  B004  used  in  monitoring  the  system. 
Four  TRAMs  in  slots  zero  through  three  are  shown  corresponding  to  the  hardware  used 
for  this  project.  The  numbers  on  the  left  and  right  sides  of  the  figure  show  the 
connections  between  the  edge  connector  PI  and  the  actual  link  designations  on  the 
C004s.  (The  numbers  under  "PI  #"  are  pin  row  numbers  on  the  edge  connector).  An 
arrow  accompanies  each  pair  of  numbers  indicating  its  communication  direction:  pointing 
towards  the  C(X)4  for  Linkin  and  away  from  the  C004  for  LinkOut. 
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Figure  3.11.  Message  Exchange  Hardware  Configuration 
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IV.    MESSAGE    EXCHANGE    FUNCTIONAL    DESCRIPTION 

A.    OVERVIEW 

Multiple  processors  connected  in  some  topology  can  be  described  as  belonging  to 
one  of  three  configuration  categories:  static,  semi-static,  and  dynamic  [Ref.  24:p.  124]. 
In  the  static  topology,  the  interprocessor  connections  are  fixed  for  the  duration  of  the 
application.  Any  significant  change  requires  system  down  time  and  hardware  modifica- 
tion. Semi-static  topologies  allow  for  system  reconfiguration  at  synchronized  points  as 
the  application  progresses.  For  example,  one  topology  may  be  best  for  loading  data,  then 
once  all  data  is  loaded,  all  processors  switch  to  a  second  topology  better  suited  for 
processing  the  data.  A  third  topology  may  later  be  used  to  output  or  display  the  results. 
The  Esprit  Project  P1085  [Ref  24]  discusses  a  large  scale,  semi-statically  reconfigurable 
Transputer  network  which  uses  crossbars  (72  x  72),  a  specialized  control  bus,  and  a  three 
layer  operating  system. 

Most  flexible  of  the  three  is  dynamic  configuration  in  which  any  processor  can  be 
dynamically  connected  to  any  other  processor  upon  request  while  the  application  is 
running.  Harp  [Ref  24:p.  124]  states  that  dynamic  switching  allows  dynamic  load 
balancing,  fault  tolerance  and  offers  potential  benefits  for  efficient  implementation  of 
high  level  languages.  Although  the  greatest  flexibility  can  be  achieved  with  complete 
interconnectivity,  dynamic  configuration  also  incurs  the  greatest  overhead  in  system 
maintenance. 

The  message  exchange  program  as  developed  here  is  intended  to  provide  maximum 
flexibility  in  data  transfer  within  a  dynamically  reconfigurable  Transputer  network.  As 
explained  earlier,  each  Transputer  has  four  links  with  which  to  communicate  to  similarly 
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equipped  components.  If  a  given  Transputer  requires  communication  to  more  than  four 
components,  data  must  be  routed  through  the  intermediaries  until  it  eventually  arrives  at 
the  intended  destination. 

The  message  exchange  developed  here  uses  the  CXX)4  crossbars  on  a  B012 
motherboard  to  establish  direct  connections  between  requesting  Transputers.  With  this 
hardware,  a  process  using  one  link  of  a  TRAM  can  be  directly  connected  to  any  of  the 
other  31  TRAM  links  on  a  fully  populated  B012  (that  is,  with  16  TRAMs).  Additional 
code  was  also  added  to  detect  and  recover  from  failed  external  links  between  crossbars 
used  in  making  these  connections. 

Hill  [Ref  191  served  as  the  basis  for  this  program,  however,  significant  modifications 
were  made  to  complete  the  partial  code  provided  and  adapt  it  the  hardware  design  of  the 
BO  12.  The  example  cited  was  designed  around  a  proposed  use  of  a  single  C004  and 
employed  all  Transputer  components  as  traffic  routers  for  external  processors.  The 
program  here  uses  both  C004s  of  the  BO  12  and  allows  application  code  to  be  embedded 
in  each  TRAM  for  production  work,  with  the  remaining  code  serving  as  a  communica- 
tions kernel  which  obtains  the  requested  connections  status. 

A  functional  description  of  the  message  exchange  in  both  normal  and  link  failure 
operations  is  discussed  below,  followed  by  a  detailed  explanation  of  the  code  in  the 
Controller  and  TRAM  modules.  A  complete  listing  of  all  message  exchange  code  is 
provided  in  Appendix  A.  Code  used  in  program  development  and  testing  is  provided  in 
Appendix  B.  Appendix  C  contains  code  for  a  visual  display  of  message  exchange 
activity  which  was  used  extensively  in  system  debugging. 
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B.    SYSTEM  OPERATIONS 

1.  Link  Control  Command  Description 

Communications  on  the  link  control  pipe  consist  of  three-byte  packets:  one  byte 
each  for  link  command,  source,  and  destination  [Ref  19:pp.  16-17].  There  are  six 
commands  issued  by  either  the  controller  or  any  of  the  managers  to  operate  the  crossbars 
and  thus  dynamically  control  the  network  tof)ology  and  recover  from  failed  links.  All 
commands  are  predefined  in  a  library  of  constant  definitions  made  available  to  the 
appropriate  procedures. 

•  REQ  (link  request)  issued  by  a  Manager  (process  on  a  TRAM  monitoring  link 
control  commands)  requesting  the  creation  of  a  new  link  for  one  of  its  two  Agents 
(processes  on  a  TRAM  performing  data  output)  as  a  source. 

•  ACK  (request  acknowledge)  issued  by  the  Controller  (process  on  T212  link 
crossbar  controller)  when  a  requested  link  is  successfully  established,  thus 
informing  the  Manager  the  new  link  is  available. 

•  REL  (link  release)  issued  by  a  Manager  when  an  existing  link  is  no  longer 
needed.  After  communications  are  completed  and  the  specified  path  no  longer  in 
use,  the  link  resources  should  be  freed  for  other  users. 

•  BRK  (release  acknowledged)  issued  by  the  Controller  when  the  link  has  been 
released.  This  allows  the  Manager  to  repeat  the  cycle  and  request  a  new  link  to 
the  next  desired  destination  as  necessary. 

•  ABT  (communications  abort)  issued  by  the  Manager  when  an  attempted 
outbound  communication  has  exceeded  a  watchdog  timer,  thus  signalling  a 
communication  failure. 

•  REI  (link  reinitialization)  issued  by  the  Controller  to  synchronize  reinitialization 
of  both  directions  of  both  ends  of  the  affected  link. 

2.  Link  Failure  And  Recovery  Procedures 

Under  routine  circumstances,  that  is,  using  only  the  Occam  "!"  and  "?" 
primitives  for  external  link  communications,  a  hardware  failure  of  the  external  link  used 
may  cause  the  Transputers  involved  not  only  to  lose  data,  but  also  to  hang  entirely 
causing  a  need  for  a  hardware  reset.  Transputer  deadlock  occurs  as  soon  as  a  data  packet 
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is  sent  along  a  link  and  its  corresponding  acknowledge  packet  is  never  received.  (Or 
conversely,  depending  upon  when  the  failure  took  place,  an  acknowledge  packet  is  sent 
and  the  next  data  packet  never  arrives).  Although  a  crossbar  can  be  employed  to  switch 
in  a  "good"  connection  to  replace  the  failed  wire,  the  Transputers  in  use  arc  still  stuck  as 
before  while  its  link  engine  is  waiting  for  a  transmitted  but  lost  packet  which  will  not  be 
regenerated  by  either  Transputer. 

Recovery  of  a  failed  link  and  the  Transputers  involved,  employs  Transputer 
Development  System  procedures  in  the  reinit  library.  These  library  procedures  imple- 
ment input  and  output  processes  in  lieu  of  the  standard  "!"  and  "?"  primitives,  and  can  be 
made  to  terminate  the  process  even  in  the  presence  of  a  communication  failure.  The  five 
library  procedures  and  their  associated  parameters  arc  as  follows:  [Ref  9:p.  271] 

•  PROC  InputOrFail.t   (CHAN  OF  ANY  c,     []BYTE  message, 

TIMER  time,  VAL  INT  t, 
BOOL  aborted) 

•  PROC  OutputOrFail.t  (CHAN  OF  ANY  c,  []BYTE  message, 

TIMER  time,  VAL  INT  t, 
BOOL  aborted) 

•  PROC  InputOrFail.c   (CHAN  OF  ANY  c,  []BYTE  message, 

CHAN  OF  INT  kill,  BOOL  aborted) 

•  PROC  OutputOrFail.c  (CHAN  OF  ANY  c,  []BYTE  message, 

CHAN  OF  INT  t,  BOOL  aborted) 

•  PROC  Reinitialise    (CHAN  OF  ANY  c) 

Input/OutputOrFail.t  procedures  take  the  following  parameters:  the  data 
channel  c  which  is  to  be  guarded  for  faults,  the  actual  data  variable  message,  a  timer 
channel  time  and  a  delay  value  t  in  clock  ticks.  The  procedures  return  the  boolean  value 
aborted  equal  to  true  if  the  communication  failed  to  complete  in  the  allowed  time 
interval  t.  If  the  communications  are  successful,  the  data  is  passed  and  aborted  is  set  to 
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false.  In  either  case,  although  the  communications  may  fail,  the  library  procedure  always 
successfully  terminates,  thus  protecting  the  system  from  failure.  By  using  a  timer  to 
determine  successful  communication  completion,  these  procedures  are  used  to  detect  a 
failure  and  thus  iniriate  recovery  actions.  Communication  failure  detection  functions  are 
assigned  to  the  Agent  process  which  is  responsible  for  conducting  data  output  on  each 
TRAM  in  the  message  exchange.  Figure  4.1  shows  a  state  diagram  tracing  through  the 
sequence  of  control  commands  involved  in  both  normal  and  fault  conditions. 
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Figure  4.1.  Link  Control  Command  State  Diagram 

In  situations  where  a  watchdog  timer  is  not  sufficient  to  detect  communications 
failure,  but  where  the  failure  can  be  detected  by  other  means,  the  Input/OutputOrFail.c 
procedures  can  be  used.  Instead  of  reporting  a  failure  based  on  a  timer,  this  pair  of 
procedures  takes  an  additional  external  signal,  an  integer  kill,  to  force  the  current 
communications  to  terminate.  Since  the  data  output  routines  in  the  Agent  processes 
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detect  and  report  a  link  failure,  the  communications  input  of  the  User  processes  use  these 
external  "kill"  procedures  to  terminate  a  failed  communication  based  on  link  control 
signals  from  the  appropriate  Agent  via  the  link  control  pipe.  By  relying  upon  only  the 
communication  source  (output)  end  of  the  link  to  detect  link  failures,  timing  difficulties 
and  redundant  fault  signals  inherent  in  the  message  exchange  system  arc  avoided. 

Finally,  the  Reinitialise  procedure,  which  takes  as  its  only  parameter  the 
communication  channel  c  involved,  actually  resets  the  failed  link  engine  allowing  them  to 
be  reused.  Reinitialise  should  be  called  only  after  both  directions  of  both  link  engines 
are  terminated  (by  the  above  procedures)  and  no  further  communication  is  attempted  by 
them.  Before  a  reset  link  engine  is  tasked  with  conducting  communications,  the  crossbar 
controlling  Transputer  should  have  assigned  a  working  connection  to  correct  the  fault. 
Repeated  failures,  however,  are  tolerable  and  the  sequence  of  events  in  link  recovery 
repeats  until  the  desired  communications  are  completed  successfully  using  a  working 
connection. 

3.     Link  Control  Command  Action  Summary 

Table  4.1  describes  the  life  cycle  of  the  various  controlling  commands  as  they 
are  passed  among  the  key  modules  in  the  pipe.  Origination  of  each  command  is  straight 
forward  and  not  flexible,  given  the  requirements  of  the  system.  However,  termination 
(removal  from  pipe)  of  commands  was  altered  several  times  during  software  develop- 
ment. In  all  cases  possible,  as  soon  as  the  command  has  travelled  along  the  pipe  to 
complete  all  its  functions,  the  last  station  taking  action  on  the  command  removes  it  from 
the  pipe.  When  several  components  take  action  on  a  single  command  (link  reinitializa- 
tion, for  example),  the  command  parameters  may  be  modified  allowing  the  terminal 
component  to  remove  the  command  without  fear  of  preventing  the  flow  of  needed  control 
information. 
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TABLE  4.1 
LINK  CONTROL  COMMANDS  AND  PROGRAM  MODULE  ACTIONS 


CMD 

Link. 

CONTROLLER 

MANAGER 

With  Source  Involved 

With  Dest  Involved 

REQ 

If  link  made: 
Link.REQ  Consumed, 
Link  ACK  Originates. 

If  link  not  made: 
Link.REQ  Recycled. 

Link,REQ  Originates 

prompted  by  Agent,  which 
is  itself  prompted  by  first 
byte  of  data  message  from 
User. 

Link.REQ  passed  on. 

ACK 

Link.REI  Originates. 
Link.REI  Consumed. 

Link.ACK  passed  on, 
source  byte  set  to  nil,  notify 
Agent  to  transmit  data. 
Link.ACK  Consumed  if 

destination  byte  already  nil. 

Link.ACK  passed  on, 

destination  byte  set  to  nil, 

notify  User  to  receive 

data. 

Link.ACK  Consumed  if 

source  byte  already  nil. 

REL 

If  link  broken: 
Link.REL  Consumed, 
Link.BRK  Originates. 

If  not  broken: 
Link.REL  Recycled. 

Link.REL  Originates 

based  on  request  from 
Agent  upon  successful 
completion  of  data  trans- 
mission. 

Link.REL  passed  on. 

BRK 

Link.BRK  Originates. 
Link.BRK  Consumed. 

Link.BRK  Consumed,  noti- 
fy Agent  link  is  broken. 
User  passes  next  data  block 
to  Agent. 

Link.BRK  passed  on. 

ABT 

Link.ABT  passed  on. 
Link.REI  Originates. 
If  new  link  made: 
Link. ACK  Originates. 
If  new  link  not  made: 
Link.REQ  Originates. 

Link.ABT  Originates 

based  on  notification  from 
Agent  of  failed  communi- 
cation. 

Link.ABT  Consumed  when 
command  makes  complete 
circuit,  ensuring  destination 
informed  of  failure. 

Link.ABT  passed  on, 
notify  User  to  abort  input 
of  data. 

Note:  destination  User 
notices  link  failure  only 
when  informed  via  the 
command  pipe.  Failure 
detected  by  source 
Agent. 

REI 

Link.REI  Originates. 
Link.REI  Consumed. 

Link.REI  passed  on,  source 
byte  set  to  nil,  notify  Agent 
to  reinitialize  link. 
Link.REI  Consumed  if 
destination  byte  already  nil. 

Link.REI  passed  on,  des- 
tination byte  set  to  nil, 
notify  User  to  reinitialize 
link.  Link.REI  Con- 
sumed if  source  byte  al- 
ready nil. 
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C.    CONTROLLER  MODULE 

The  Controller  code  executes  in  a  single  process  which  resides  entirely  on  the  T212 
C004  controller  installed  on  the  B012  motherboard.  In  the  configuration  established  for 
this  project,  the  T212  is  connected  in  a  unidirectional  pipe  formed  by  the  TRAMs  on  the 
motherboard.  Link  request  and  status  information  is  circulated  in  the  pipe  until  received 
by  the  Controller,  at  which  time  appropriate  action  is  taken  to  either  connect  or 
disconnect  a  requested  data  path,  or  to  recover  from  a  reported  failed  path.  The 
Controller  directs  the  crossbar  resources  of  the  B012  ensuring  that  all  requests  are 
fulfilled,  if  physically  possible,  and  avoiding  potential  system  deadlock  caused  by  partial 
allocation  of  resources. 

The  Controller  code  consists  of  a  main  section  which  monitors  the  link  control  pipe 
and  takes  action  by  calling  one  or  more  of  its  four  subprocedures,  while  also  managing 
connection  resources  and  status.  Subprocedure's  functions  will  be  discussed  individually 
followed  by  an  explanation  of  the  main  code  for  the  Controller. 

I.     Subprocedure:  reset.or.nil 

When  the  enquire  command  and  the  output  link  designation  of  interest  is  sent  to 
a  C004,  the  response  is  the  byte  designation  of  the  input  link  currently  connected  (see 
Table  3,1).  The  left  most  bit  of  the  response  byte  (bit  seven)  indicates  that  link  is  active 
when  set  high,  or  inactive  (disconnected)  when  set  low.  To  define  one  of  the  32  possible 
connections  to  any  given  output  link,  five  switches  must  be  set.  Their  status  is  indicated 
by  the  five  lower  bits  of  returned  byte  (bits  four  through  zero).  When  a  programmed 
connection  is  broken,  only  bit  seven  is  changed  (set  low);  the  selection  bits  are  altered 
only  when  a  new  link  is  established.  Bits  six  and  five  of  the  enquire  response  byte  are 
not  used. 
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Since  past  connections  to  a  given  link  are  of  no  interest  to  this  program,  the 
reset.or.nil  procedure  will  return  byte.nil  if  bit  seven  of  the  enquire  response  byte  is 
low,  thus  indicating  that  the  connection  is  inactive  and  free  for  assignment.  If  bit  seven  is 
high  (indicating  an  active  connection),  reset.or.nil  returns  the  current  connected  input 
link  designation  by  resetting  bit  seven  of  the  enquire  response  byte  and  leaving  bits  four 
through  zero  intact.  Setting  bit  seven  low  is  accomplished  using  the  Occam  "><"  bitwise 
exclusive-or  operator  on  the  hexadecimal  value  80  which  isolates  the  left  most  bit  of  the 
C004  enquire  response.  This  subprocedure  is  repeatedly  called  by  other  subprocedures  as 
well  as  the  main  code. 

2.     Subprocedure:  c4.cmd 

A  small  code  sequence  used  in  communicating  with  the  crossbars  is  used 

repeatedly  throughout  the  Controller  and  is  coded  as  a  subprocedure  for  clarity  and  to 

improve  memory  usage  efficiency  (recall  that  only  two  kilobytes  of  memory  are  available 

on  the  T212).  C4.cmd  sends  a  command  followed  by  zero,  one  or  two  parameters  (as 

appropriate  for  the  command:  see  Table  3.1)  to  a  selected  crossbar.  The  choice  of 

crossbar  command  to  implement  is  determined  by  the  value  of  the  parameters  received. 

Upon  completion,  the  returned  byte  is  placed  in  parameter  b3.  The  Occam  keyword 

VAL  coupled  with  other  type  identifiers,  defines  the  parameters  that  follow  to  be 

constants  for  the  duration  of  the  subprocedure. 

PROC  c4.cmd  (CHAN  OF  ANY  to.x,  from.x, 

VAL  BYTE  cmd,  bl,  b2,  BYTE  b3) 

If  the  command  involves  a  change  of  link  status,  c4.cmd  code  proceeds  to 
immediately  query  the  crossbar  and  verify  the  requested  status  change,  for  example:  that 
the  requested  connection  is  created  as  desired.  The  result  from  the  enquiry  is  then  run 
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through  the  reset.or.nil  subprocedure  which  transforms  the  crossbar  response  into  a  form 
indicative  of  the  task  accomplishment.  For  example,  if  a  link  was  to  be  broken  the  value 
returned  by  c4.cmd  (via  use  of  reset.or.nil)  would  be  byte.nil  indicating  a  free  link. 
3.     Subprocedure:  get.current.tie 

No  program  defined  tables  or  records  are  kept  by  the  Controller  with  regard  to 
link  connection  status.  When  a  connection  status  must  be  determined,  for  example,  to 
verify  a  link  is  free  prior  to  making  a  connection,  the  GXKs  are  consulted  directly.  As 
discussed  above,  the  enquire  command  is  used  to  establish  current  connections.  By 
sending  the  command  with  the  byte  designation  of  the  output  link  of  interest,  the  CXX)4 
will  respond  with  the  byte  designation  of  the  currendy  connected  input  link.  Reset.or.nil 
is  then  called  to  strip  the  high  bit  from  the  response  if  present,  or  change  to  byte.nil  if 
not. 

Since  the  five  selection  bits  (four  through  zero)  of  the  enquire  response  byte 
change  only  when  a  new  link  connection  is  ordered  or  when  the  entire  C004  is  reset, 
current  status  is  always  available  and  accurate.  By  relying  strictly  on  the  hardware 
switch  positions  internal  to  the  C004s,  potential  errors  in  status  table  maintenance  are 
completely  avoided.  Processor  effort  needed  for  C004  status  enquiry  is  comparable  to 
status  table  look-up  and  maintenance  since  all  Controller  code  is  sequential,  therefore  no 
competition  exists  for  use  of  the  links  between  the  T212  and  the  C004's. 

All  subprocedures  dealing  with  the  C004s  are  passed  four  channels:  to.cO, 
from.cO,  to.cl,  and  from.cl,  which  are  the  communication  paths  between  the  T212  and 
the  C004s.  The  get.current.tie  subprocedure  is  also  passed  the  source  or  destination  link 
of  interest,  and  it  returns  either  the  designation  of  the  current  connection,  or  byte.nil  if 
the  link  of  interest  is  inactive. 
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PROC  get. current. tie  (CHAN  OF  ANY  to.cO,  from.cO, 

to.cl,  from.cl, 
VAL  BYTE  in. link 
BYTE  link. tied. to) 

Initially,  get.current.tie  must  determine  which  CXX)4  to  enquire  since  linkO's 
and  links 's  are  connected  differently  (see  Figure  3.3).  This  is  complicated  by  the 
numbering  scheme  of  the  C004  links  and  the  slot  numbers  that  are  hardwired  to  them. 
Therefore,  a  translation  array  is  used  to  convert  the  CX)04  link  designation  to  a  slot 
number.  In  this  program,  linkO's  are  numbered  0  through  15,  and  linkB's  arc  numbered 
20  through  35. 

It  should  be  recognized  that  a  consistency  between  the  C004  wiring  schemes  is 
relied  upon  to  perform  these  operations.  Regarding  B012  organization,  Figure  3.3 
showed  that  for  a  given  TRAM  linkO  or  link3,  the  input  is  routed  to  one  C004  and  the 
output  to  the  other.  Although  different  C004s  are  involved,  the  C004  link  designation 
used  is  the  same.  This  is  valuable  since  the  enquire  command  responds  only  with  the 
input  link  connected  to  the  output  link  in  question.  There  is  no  enquire  to  determine  what 
output  is  connected  to  a  given  input.  Therefore,  the  connected  output  link  is  assumed  to 
be  the  same  link  designation  (although  different  C004s)  as  the  input  link.  This 
assumption  has  proved  reliable  since  all  connections  are  accomplished  for  both  input  and 
output  directions  in  tandem. 

4.     Subprocedure:  make.conn 

When  the  main  code  directs  the  establishment  of  a  connection,  the  make.conn 
subprocedure  is  called.  All  actions  necessary  to  complete  the  requested  connection  are 
contained  in  this  subprocedure.  Each  routine  includes  a  returned  boolean  value  conn.ok 
which  provides  connection  verification  results.  Make.conn  is  called  with  the  four  C004 
channels,  source  and  destination  bytes,  and  returns  only  the  boolean  conn.ok: 
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PROC  make. conn  (CHAN  OF  ANY  to.cO,  from.cO, 

to.cl^  from.cl, 
VAL  BYTE  in. source,  in.dest, 
BOOL  conn. ok) 

Three  distinct  cases  must  be  handled  by  this  prcx;edure:  connect  any  linkO  to  any 

link3,  connect  any  linkO  to  linkO,  and  connect  any  link3  to  link3.  The  first,  linkO  to  link3, 

is  straight  forward  since  only  one  connection  must  be  made  on  each  C004  and  the 

procedure  is  symmetric  for  each  link.  Figure  4.2  shows  how  the  crossbars  are  used  to 

connect  any  linkO  to  any  link3  on  the  BO  12.  Note  that  the  linkO  and  link3  involved  can 

even  be  of  the  same  Transputer.  After  enquiring  the  C004s  with  the  source  and 

destination  links  of  interest,  make.conn  orders  the  connection  if  both  links  involved  are 

free.  Otherwise,  the  procedure  returns  false  for  conn.ok.  The  remaining  two  cases, 

however,  are  much  more  complex  in  that  two  connections  on  each  C004  must  be  used  as 

well  as  an  external  edge  connector  jumper.  Since  like-link  cases  are  similar,  an  internal 

subprocedure  make.0033  is  used. 


C004-1 


C004-0 


Hardwired  Connections  Between  TRAMs  and  C004s 
Software  Controlled  Connection  in  C004  Crossbar 


Figure  4.2.  Crossbar  Connections  To  Connect  Any  LinkO  To  Any  Link3 
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Although  the  C004s  can  provide  immediate  status  of  all  internal  connections, 
there  is  no  direct  way  to  establish  placement  of  the  PI  edge  connector  jumpers  on  the 
BO  12.  Therefore,  the  user  supplies  a  table  listing  the  combinations  of  edge  connectors  to 
be  used  to  satisfy  either  linkO  to  linkO  or  link3  to  link3  requests.  Sixteen  jumpers  can  be 
arranged  on  the  PI  edge  connector  pins,  but  not  any  jumper  arrangement  will  satisfy 
either  type  of  connection.  Due  to  the  design  of  the  B012,  only  specific  combinations  may 
be  used  to  complete  a  like-link  connection.  Table  4.2  shows  one  selection  of  possible 
jumper  arrangements. 

For  this  project,  eight  jumpers  were  selected  for  linkO  to  linkO  connections  and 
eight  for  link3  to  link3.  The  user  provides  values  in  two  byte  arrays  edge.a  and  edge.b  in 
which  the  values  for  a  specified  index  form  a  pair  of  edge  connectors  defining  the 
location  of  a  jumper.  For  example,  edge.a[3]  and  edge.b[31  define  connections  25  and 
26,  respectively.  This  tells  make.0033  that  these  C(X)4  links  can  be  used  to  jumper 
across  from  one  C004  to  the  other  to  establish  a  connection  between  two  linkO's. 

TABLE  4.2 
JUMPER  PLACEMENT  FOR  C004  TO  C004  CONNECTIONS 


For  LinkO  To  LinkO: 

ForLink3  toLink3: 

Connect  PI  Row: 

To  PI  Row: 

Connect  PI  Row: 

To  PI  Row: 

2 

4 

10 

13 

17 

20 

28 

30 

3 

5 

11 

14 

18 

21 

29 

31 

0 

6 

8 

12 

16 

22 

24 

26 

1 

7 

9 

15 

19 

23 

25 

27 

The  user  also  provides  two  integer  values:  edgeO.ct  and  edge3.ct  which  are  the 
number  of  possible  jumpers  available  for  each  type  of  connection.  The  sum  of  edgeO.ct 
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and  edge3.ct  is  the  total  number  of  jumpers  installed.  The  first  edgeO.ct  listings  in  the 
two  byte  arrays  are  for  linkO  to  linkO  connections,  and  the  subsequent  edge3.ct  pairs  are 
for  link3  to  link3  connections. 

A  boolean  table,  edge.ok,  maintained  globally  in  the  Controller  module,  tracks 
reported  status  of  each  edge  connector.  When  a  TRAM  reports  an  edge  connector  has 
failed,  the  corresponding  index  in  edge.ok  is  set  to  false.  Only  edge  connectors  with  a 
corresponding  edge.ok  value  of  true  are  used  to  complete  like-links. 

With  the  jumper  information  provided,  the  make.0033  subprocedure  can 
proceed  to  establish  a  connection  between  two  like-links.  Figure  4.3  shows  the 
necessary  crossbar  and  jumper  connections  to  accomplish  this.  As  each  crossbar 
connection  is  made  the  procedure  immediately  enquires  the  status  of  the  assumed  newly 
formed  links  and  checks  the  returned  answer  against  what  is  expected.  Any  deviation  in 
comparing  expected  from  actual  status  is  treated  as  an  incomplete  connection  and  the 
make.conn  procedure  returns  FALSE  for  the  value  conn.ok. 
5.     Subprocedure:  break.conn 

Procedure  break.conn  works  similarly  to  make.conn  in  that  the  same  three 
cases  of  link  combinations  must  be  handled,  and  the  two  cases  involving  like-links  are 
handled  with  an  internal  subprocedure.  Significant  difference  is  that  the  CXX)4  command 
to  disconnect  links,  c4.disconn.output,  is  sent  instead  of  the  command  to  make 
connections.  Enquiries  of  link  status  are  still  conducted  and  the  returned  bytes  are  sent  to 
the  reset.or.nil  procedure.  Successful  termination  of  a  link  is  evidenced  by  all  links 
involved  showing  the  byte.nil  status  which  shows  that  the  links  are  now  free  for  their 
next  assignment. 
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Hardwired  Connections  Between  TRAMs  and  C004s 
Software  Controlled  Connection  in  C004  Crossbar 
Bidirectional  Link  Cable  on  Edge  Connector  (AB  to  AB) 


Figure  4.3.  Crossbar  And  Jumper  Connections  To  Connect  Two  Like-Links 

It  should  be  noted  again  that  there  are  no  link  status  tables  to  be  maintained  or 
updated.  In  all  cases,  link  status  is  obtained  directly  from  the  C004s  involved  to  ensure 
accurate  assignment  and  freeing  of  crossbar  connections. 
6.     Main  Controller  Code 

After  variable  initialization,  the  Controller  module  enters  an  infinite  loop  which 
constantly  receives  the  next  link  control  bytes  and  takes  the  appropriate  action.  Program 
structure  is  shown  below  followed  by  details  of  key  folds. 
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SEQ 

...initialize  C004s,  clock, 

and  edge. ok  status  array 
WHILE  TRUE 
ALT 

from. pipe  ?  token;  source;  dest 
IF 

token  =  link.req 

. . . action  for  connection  request 
token  =  link.rel 

. . .action  for  connection  release 
token  -  link.abt 

. . .action  for  failed  link 
token  -  link.ack 

. . . action  for  recycled  link  acknowledge 
token  =  link.brk 

. . .action  for  recycled  link  break 
token  =  link.rei 

...action  for  recycled  link  reinitialize 
otherwise 

...action  for  unrecognized  command 
(clock  interval  reached  and 
failed  request  count  limit  exceeded) 
...reset  a  edge . ok  status  to  TRUE 

a.     Action  For  Link  Request 

Upon  receiving  a  link  request,  the  status  of  the  requested  links  are 
determined  using  get.current.tie.  If  the  returned  status  for  both  source  and  destination 
are  byte.nil  (indicating  free  links),  the  requested  connection  is  established  using 
make.conn,  and  a  link  acknowledge  (link.ack)  packet  is  sent  to  the  pipe  to  inform  the 
TRAMs  involved  that  a  connection  is  made  and  data  can  be  transmitted.  If  a  requested 
link  is  currently  busy,  the  link  request  is  recycled  through  the  pipe  in  hopes  that  the 
resources  will  be  freed  for  reassignment  by  the  time  the  request  returns.  Using  the  link 
control  pipe  as  a  waiting  queue  for  failed  requests  ensures  that  the  request  will  not  be  lost 
while  it  is  waiting  for  resources,  and  also  minimizes  storage  and  processing  requirements 
of  the  Controller  code  on  the  T212. 
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If  the  link  request  could  not  be  fulfilled  due  to  lack  of  available  working 
"good"  edge  connectors,  a  counter  is  incremented  tracking  the  number  of  such  failed 
requests.  When  the  req.cnt  reaches  a  preset  value,  the  edge  connector  management 
algorithm  begins  searching  for  repaired  resources.  This  is  explained  in  more  detail 
below, 

b.  Action  For  Link  Release 

A  link  release  command  is  a  TRAM  request  that  an  existing  link  be  freed 
for  new  use  as  needed.  A  single  call  to  the  break.conn  terminates  the  connection  and 
frees  the  resources.  Although  there  is  no  resource  conflict  preventing  a  link  from  being 
disconnected,  if  the  crossbar  does  not  respond  with  the  expected  byte.nil  status,  link.rel 
will  be  recycled  through  the  pipe  and  break.conn  will  be  called  again.  Upon  successful 
termination  of  a  connection,  a  link  break  (link.brk)  packet  is  sent,  informing  all 
concerned  that  the  resources  are  free. 

Link  request  and  release  commands  are  the  first  two  choices  in  the  WHILE 
TRUE  loop  of  the  controller  due  to  their  frequency.  In  normal  conditions,  only  these  two 
commands  will  ever  reach  the  Controller  module  for  action. 

c.  Action  For  Link  Abort 

When  a  link  abort  (link.abt)  is  received,  the  command  is  immediately 
passed  on  to  the  pipe  since  additional  action  must  be  taken  by  other  TRAMS.  (The  abort 
command  is  eventually  removed  by  the  originating  TRAM).  Next,  the  Controller 
determines  the  exact  connection  and  its  edgelist  table  index  involved  via  the  get.cur- 
rent.tie  procedure.  To  prevent  a  known  bad  edge  connector  from  being  immediately 
reused  by  make.conn,  the  edge.ok  value  of  the  corresponding  index  is  set  to  false. 

With  the  failed  edge  connector  marked  as  bad,  the  Controller  continues  by 
sending  a  link  reinitialize  (link.rei)  to  the  pipe  so  that  both  directions  of  both  links 
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involved  in  the  aboned  communication  may  be  reset.  Note,  the  reinitialize  command  is 
sent  after  the  link  abort  command  to  ensure  that  all  communication  attempts  cease  prior 
to  completing  the  link  engine  reinitialization  process.  This  is  required  by  the  TDS  link 
engine  recovery  procedures.  At  this  point,  the  existing  failed  connection  is  formally 
broken  with  break.conn  and  remade  with  make.conn  using  a  different  path. 

From  here,  the  procedure  is  similar  to  that  of  the  link  request  action  above. 
If  the  connection  cannot  be  reestablished,  a  newly  generated  link  requested  is  placed  in 
the  pipe.  If  the  failure  to  make  a  new  connection  is  due  to  lack  of  edge  connector 
resources,  the  req.cnt  counter  is  incremented.  If  a  new  connection  is  made,  a  link 
acknowledge  is  placed  in  the  pipe  to  inform  the  TRAMs  involved  to  attempt  communica- 
tions again. 

d.  Action  For  Link  Acknowledge,  Break  And  Reinitialize. 

Under  normal  circumstances,  these  commands  are  removed  by  the  last 
TRAM  which  takes  action  based  on  their  receipt.  If  a  TRAM  fails  to  remove  such  a 
command,  they  will  be  removed  here  to  prevent  the  pipe  from  being  flooded  with 
meaningless  circulating  commands. 

An  additional  section  is  added  to  pass  noncommands  along  the  pipe.  This  is 
used  exclusively  for  testing  purposes  when  specialized  test  commands  were  placed  in  the 
pipe  to  monitor  specific  actions.  Such  testing  or  status  information,  like  TRAM  code 
completion,  is  removed  by  the  B(X)4  monitoring  the  system. 

e.  Edge  Connector  Status 

Also  within  the  WHILE  TRUE  loop  is  a  block  of  code  which  manages  the 
status  of  the  edge  connector  links.  Initially,  Controller  is  told  by  the  programmer  which 
connections  are  hard  wired  and  that  their  status  is  "good",  that  is,  the  edge  connectors  as 
listed  in   the  data   library   edgelist  are  immediately  available  for  use   to  establish 
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connecrions  between  like-links  of  TRAMS.  If  a  TRAM  detects  and  reports  a  failure  as 
the  message  exchange  proceeds  to  make  and  break  links,  the  Controller  marks  the  edge 
connector  involved  as  "bad"  and  avoids  assigning  it  in  the  future.  This  points  out  a 
sharing  of  responsibilities  between  the  Controller  and  TRAM  processors.  An  Agent  on  a 
TRAM  has  no  knowledge  of  the  routing  assigned  when  a  connection  is  requested  and 
acknowledged,  however,  it  will  delect  a  failure  if  the  connection  is  bad.  Conversely,  the 
Controller  which  made  the  requested  connection  has  routing  information,  but  cannot 
itself  determine  if  the  connection  is  actually  operational  without  being  told  by  the 
requesting  TRAM.  Similarly,  the  Controller  cannot  by  itself  determine  when  a  failed 
connection  has  been  repaired  without  first  assigning  it  to  a  TRAM  for  use.  Therefore, 
edge  connectors  previously  marked  as  "bad"  are  periodically  set  to  "good"  so  that  they 
may  be  assigned  and  tested.  A  timer-based  trigger  is  used  to  prompt  the  Controller  to 
reset  edge  connector  status. 

Immediately  within  the  WHILE  TRUE  loop  is  an  ALT  which  guards  both 
the  pipe  input  and  a  boolean  expression.  The  boolean  expression  evaluates  to  true  when  a 
predefined  time  period  has  elapsed  and  the  number  of  failed  link  requests  due  to  lack  of 
edge  connector  resources  has  exceeded  a  predefined  set  point.  Occam  keyword  AFTER 
is  used  to  measure  the  delay  from  clock  time  now  which  was  initialized  when  the 
program  began,  and  reset  when  this  expression  evaluates  to  true.  PLUS  is  a  modulo 
addition  operator  which  accurately  adds  clock  times  since  timers  recycle  from  maximum 
positive  number  to  zero.  The  "&"  is  Occam  boolean-and  which  requires  both  halves  of 
the  expression  to  be  true  for  the  entire  expression  to  evaluate  to  true.  Under  normal 
conditions  when  no  edge  connector  faults  are  reported,  the  expression  never  evaluates  to 
true  since  req.cnt  never  reaches  the  trip  point  (currently  set  at  the  number  of  TRAMs  in 
the  system).  When  many  link  failures  are  present,  the  timer  delay  prevents  the  expression 
from  constantly  evaluating  to  true  and  thus  consuming  too  much  CPU  time. 
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(req.cnt  >  no. trams)  &  cloclc  AFTER  now  PLUS  delay 
When  the  combined  expression  evaluates  to  true,  it  signals  a  need  to  look 
for  new  edge  connector  resources.  If  too  many  edge  connectors  are  marked  as  bad,  the 
system  slows  appropriately  since  the  remaining  connectors  must  be  used  for  all  like-link 
connections  causing  many  link  requests  to  be  recycled  as  they  wait  their  turn.  If  repairs 
are  made  to  the  bad  wires,  the  Controller  will  not  realize  the  status  change  unless  the 
connection  is  reused  by  the  TRAMs.  Therefore,  when  many  link  requests  fail  due  to  lack 
of  resources,  the  Controller  looks  for  newly  repaired  edge  connectors  by  periodically 
resetting  their  status  to  "good",  thus  making  them  available  for  use  by  make.conn  as  the 
need  arises.  If  the  connection  is  still  bad,  the  status  is  reset  accordingly.  If  the 
connection  has  been  repaired,  the  status  remains  good  until  it  fails  again. 

D.    TRAM  MODULE 

Code  on  each  TRAM  is  organized  into  five  processes  running  in  parallel  and 
communicating  via  internal  channels.  Figure  4.4  shows  the  assignment  of  both  internal 
and  external  channels  to  the  TRAM  processes  with  the  configuration  level  link  names 
written  outside  the  processor  box  and  local  channel  names  written  on  the  inside.  Note  the 
division  of  the  data  link  directions:  data  output  is  the  responsibility  of  the  Agent  process 
and  data  input  is  received  by  the  User  process.  Although  each  data  link  has  its  own 
Agent,  it  was  not  necessary  to  separate  the  User  processes  between  the  two  links.  Since 
the  application  code  resides  in  User,  the  User  processes  could  be  combined  into  one 
process  or  connected  via  an  internal  channel  so  that  the  given  application  could  have 
simultaneous  use  of  both  links.  However,  User  processes  were  kept  separate  in  this 
project  to  allow  independent  testing  of  both  data  links  on  each  TRAM. 
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Figure  4.4.  TRAM  Code  Process  Organization 

1.     Manager  Process 

Dedicated  to  managing  the  data  link  resources  of  the  message  exchange  for  a 
single  TRAM,  the  Manager  is  connected  via  external  links  to  the  link  control  pipe,  and 
via  internal  channels  to  both  Agent  and  User  processes  on  the  TRAM.  Since  the  code  for 
the  TRAM  processes  is  replicated  on  each  processor,  two  constants,  slot.desig  and 
no.trams,  are  passed  to  the  Manager  along  with  the  channels.  Slot.desig  tells  each 
TRAM  which  processor  it  is  corresponding  to  the  slot  number  in  which  it  resides  on  the 
BO  12.  The  slot  number  is  also  used  to  describe  the  data  links:  linkO  of  a  given  TRAM 
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equals  the  slot.desig,  and  link3  equals  slot.desig+20.  No.trams  is  a  global  constant 
telling  the  system  how  many  processors  are  installed  on  the  B012,  thus  allowing  crossbar 
connections  only  to  existing  processors.  Manager  procedure  call  is  as  follows: 

PROC  manager  (CHAN  OF  ANY  to. pipe,  from. pipe, 

mgr . to.agentO,  mgr . from. agentO, 
mgr. to. agents,  mgr. from. agents, 
mgr. from.userO,  mgr . from.user3, 
CHAN  OF  INT  mgr . to.userO,  mgr . to.user3, 
VAL  INT  slot.desig,  no. trams) 

Manager  consists  of  an  infinite  loop  monitoring  all  channel  inputs,  and  a  three 
stage  control  pipe  output  buffer  running  in  parallel  with  the  infinite  loop.  The  buffer 
itself  consists  of  three  parallel  processes,  each  as  one  stage  inputting  a  three  byte  link 
command,  source,  and  destination,  then  passing  it  to  the  next  stage.  The  middle  stage  is  a 
replicated  parallel  process  allowing  for  easy  increase  in  buffer  size,  however,  testing 
showed  that  minimizing  buffer  size  improved  performance,  and  the  smallest  reliable  size 
is  three  stages  to  allow  room  for  continuous  flow  along  the  pipe. 

Immediately  within  the  WHILE  TRUE  loop  is  an  ALT  which  guards  the  three 
possible  direct  inputs  to  Manager,  from  either  Agent  or  from  the  pipe.  Each  input  is 
described  separately  below. 

a.     Action  For  Input  From  Agent 

Either  Agent  communicates  the  needs  of  the  application  program  via  a 
single  byte  along  the  mgr.from.agentO  or  mgr.from.agent3  channels.  The  byte  re- 
ceived, one  of  several  predefined  byte  constants  for  internal  communication  and 
contained  in  the  intcmds  library,  is  analyzed  for  further  action.  If  the  Agent  signals  that 
it  is  finished  using  an  established  link,  Manager  immediately  sends  a  link  release 
command  to  the  pipe  output  buffer.  Source  and  destination  parameters  with  the  link 
release  are  generated  by  the  Manager  since  it  knows  which  Agent  is  notifying  completion 
(thus  its  link  designation)  and  the  destination  to  which  it  is  still  attached. 
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Similarly,  if  the  Agent  reports  a  failed  communication  the  Manager 
generates  a  link  abort  command  along  with  source  and  destination  byte  and  sends  the 
command  to  the  output  buffer.  For  testing  purposes,  an  additional  coinmand  was  added 
allowing  an  Agent  to  report  its  completion  of  data  communications,  thus  allowing 
performance  timers  to  record  processing  time. 

Finally,  if  the  Manager  receives  a  byte  other  than  one  of  the  predefined 
commands,  the  integer  translation  of  that  byte  is  interpreted  to  be  a  request  for  a 
connection  to  that  link.  Manager  immediately  sends  a  link  request  command,  source  and 
destination  to  the  output  buffer,  and  records  the  destination  for  use  in  generating 
subsequent  commands. 

b.     Action  For  Input  From  Link  Command  Pipe 

If  neither  Agent  is  communicating  with  the  Manager,  a  three  byte  link 
command  will  be  taken  from  the  pipe  input  when  present.  The  command  byte  is  anlyzed 
to  determine  action  to  be  taken,  if  any.  A  link  request  or  release  is  simply  passed  along 
since  it  has  not  yet  reached  its  final  destination;  the  Controller. 

When  a  link  acknowledge  is  received,  the  Manager  determine  its  relevance 
to  this  TRAM  by  checking  the  source  and  destination  bytes.  If  the  destination  byte 
matches  a  local  link,  that  User  is  informed  to  expect  receipt  of  data.  The  destination  byte 
is  then  set  to  byte.nil  for  future  reference  and  removal  by  a  subsequent  TRAM.  If  the 
source  byte  is  local  then  the  appropriate  Agent  is  informed  that  a  requested  link  is 
complete  and  to  begin  transmitting.  Likewise,  the  source  byte  is  then  set  to  byte.nil. 
Finally,  both  source  and  destination  are  checked  for  content.  If  both  are  byte.nil,  the 
entire  three  byte  command  is  removed  from  the  pipe.  Note:  this  occurs  even  if  both 
source  and  destination  are  local  links,  that  is,  the  TRAMs  linkO  and  link3  are  connected. 
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Receipt  of  a  link  break  is  relevant  to  a  TRAM  only  if  the  source  byte  is 
local.  (The  destination  code  in  the  User  module  which  receives  data  input  always 
monitors  its  channel,  and  need  not  be  notified  of  connection  creation  or  termination.) 
First,  the  local  variable  holding  the  address  of  the  destination  is  set  to  byte.nil  to  update 
current  status  of  connections.  Secondly,  the  appropriate  Agent  is  informed  that  the 
previous  connection  has  been  terminated  and  a  new  connection  may  be  requested  if 
desired.  If  either  local  link  is  involved,  the  command  is  consumed  here,  otherwise  it  is 
passed  along  the  control  pipe. 

If  a  link  abort  command  is  received,  the  command  is  immediately  passed  on 
if  neither  local  link  involved.  Since  aborts  are  originated  by  the  source,  the  command  is 
removed  by  the  source  upon  its  return,  thus  ensuring  that  both  the  Controller  and  the 
affected  destination  have  seen  the  abort  message.  If  a  local  link  is  the  cited  destination, 
the  appropriate  User  is  informed  of  the  failed  link  by  passing  an  integer  along  the  channel 
used  by  the  InputOrFail.c  procedure  to  trigger  a  forced  termination.  A  link  reinitializa- 
tion will  then  be  received,  which  informs  the  Agent  and/or  the  User  involved  to 
reinitialize  the  affected  link  engine  by  executing  the  Reinitialise  procedure.  As  with  the 
acknowledge  command,  link  bytes  corresponding  to  local  links  are  set  to  byte.nil  before 
being  passed  on.  If  both  source  and  destination  are  byte.nil,  indicating  the  command  has 
served  its  purpose,  the  command  is  removed  from  the  pipe. 

Finally,  other  bytes  passed  along  for  testing  or  troubleshooting  purposes  are 
passed  along  without  action,  to  be  removed  by  either  the  Controller  or  by  system 
monitoring  code  on  the  B004. 
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2.     Agent  Process 

The  name  Agent  was  chosen  for  this  process  to  signify  its  role:  to  act  on  behalf 
of  the  User  in  obtaining  a  connection  and  sending  data.  Once  the  Agent  has  instructions 
from  the  User,  the  User  proceeds  with  the  application  while  the  Agent  waits  for  the 
connection,  passes  the  data  block  or  recovers  from  a  failed  link. 

Like  the  Manager,  the  Agent  uses  several  internal  channels  to  communicate  with 

the  other  TRAM  processes,  but  only  one  external  channel  which  outputs  data  to  the 

destination  via  crossbars.  Agent  also  receives  a  constant  integer  denoting  its  link 

designation.  Header  code  for  AgentO  is  shown;  code  for  AgentS  is  identical  after 

changing  the  "0"  identification  to  "3". 

PROC  agentO  (CHAN  OF  ANY  mgr . to.agentO, 

mgr . from . agentO ,  agentO . to . userO , 
agentO . from. userO,  linkO.out, 
VAL  INT  link.desig) 

Upon  commencing  execution,  the  Agent  informs  its  User  that  it  is  ready  to 

receive  data  for  outputting.  Agent  then  enters  an  infinite  loop  which  monitors  input  with 

a  PRI  ALT  from  two  sources:  Manager  or  its  respective  User.  The  sequential  actions 

from  both  code  blocks  is  interleaved  by  message  passing  via  internal  channels.  After 

initialization,  the  Agent  sequentially:  (1)  receives  data  and  destination  from  User,  (2) 

requests  a  link  connection  via  Manager,  (3)  receives  notification  of  link  connection  and 

(4)  passes  data.  If  the  connection  is  bad,  the  Agent:  (1)  informs  Manager  of  a  problem 

and  recovers,  (2)  receives  notification  of  link  termination,  (3)  and  finally  informs  User 

that  it  is  ready  to  receive  the  next  data  and  destination  block.  Note  that  the  requested 

destination  address  is  embedded  as  the  first  byte  in  the  data  byte  received  from  the  User. 

Use  of  the  reinitialization  procedures  requires  that  data  be  passed  as  a  block  of  bytes. 

However,  Occam  contains  numerous  retyping  operators  to  convert  a  collection  of  bytes  to 

and  from  other  data  types  such  as  integers,  reals,  etc. 
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In  the  event  of  failed  communication,  the  OutputOrFail.t  procedure  returns  a 

boolean  true  in  the  aborted  variable  when  the  watchdog  timer  expires.  This  causes 

program  execution  to  enter  an  IF  construct  which:  (1)  informs  the  Manager  of  link 

failure,  (2)  receives  return  confirmation  then  (3)  proceeds  to  reinitialize  the  failed  link.  If 

the  communication  terminates  without  failure,  the  Manager  is  informed  of  the  request 

that  the  connection  be  terminated. 

3.     User  Process 

The  third  member  process  on  each  TRAM  is  the  User,  which  completes  the 

relationships  and  interactions  described  above.  User  code  also  contains  all  application 

code  to  be  executed  in  the  message  exchange.  For  this  analysis,  dummy  code  was 

inserted  to  simulate  a  load  on  each  TRAM  processor  and  the  resulting  influence  on 

system  efficiency.  That  is,  as  the  User  module  consumes  more  effort  processing  data, 

fewer  requests  for  communications  will  be  issued  in  a  given  time  period.  Conversely, 

User  workload  influences  the  performance  of  the  other  modules  on  the  TRAM  by  taking 

more  time  slices  to  complete  processing. 

User  header  code  follows  the  patterns  above  with  internal  channels,  one 

external  data  input  channel,  and  two  constant  integers. 

PROC  userO  (CHAN  OF  ANY  agentO . to.userO, 

agent 0 , from . userO ,  mgr . from . userO , 
linkO . in, 
CHAN  OF  INT  mgr . to .userO, 
VAL  INT  link.desig,  no. trams) 

User  code  consists  of  two  WHILE  TRUE  blocks  running  in  parallel.  The  first 
block  is  sequential  code  which  manages  data  input  similar  to  the  output  code  used  in  the 
Agent.  However,  the  User  fault  tolerant  routines  rely  on  external  notification  of  a  fault  to 
terminate  a  failed  communication  by  using  the  InputOrFail.c  procedures.  This  notifica- 
tion is  received  from  the  Manager  via  the  integer  channel  mgr.to.user.  Accordingly,  the 
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input  link  must  allow  for  reinitialization  with  or  without  data  being  received  along  the 
link  at  the  time  of  the  failure.  (Recall  that  in  order  to  recover  from  a  link  fault,  both  input 
and  output  link  engines  on  both  ends  of  communications  must  be  reset).  Therefore, 
depending  upon  the  command  received  from  Manager,  the  link  may  be  reinitialized  at 
different  points  in  the  code. 

Application  code  is  running  in  the  second  parallel  block,  which  repeats  a 
predefined  number  of  times  for  testing  purposes,  each  repetition  generating  and  sending  a 
new  data  block.  This  includes  manipulation  of  data  received,  generation  of  new  data, 
calculations  and  determination  of  destination  to  receive  data.  For  this  analysis  various 
routines  were  used  in  different  test  runs  to  select  only  like-links  or  unlike-links,  as  well  as 
simulating  different  degrees  of  load  to  be  processed  on  the  TRAM.  A  random  number 
generator  with  a  seed  based  on  processor  clock  time  was  used  to  select  the  next 
destination  when  complete  randomness  was  tested.  Random  numbers  were  generated 
until  an  acceptable  link  designation  was  established,  which  was  then  sent  as  a  link 
request. 

4.     TRAM  Code  Configuration 

To  establish  priority  grouping  of  the  five  TRAM  processes,  the  configuration 
code  consists  of  a  PRI  PAR  with  a  high  priority  PAR  block  nested  within.  The  high 
priority  block  includes  the  processes  which  are  crucial  to  external  channel  communica- 
tion: Manager  and  Agents.  The  low  priority  block  for  applications  contains  both  User 
processes.  Although  this  may  appear  at  first  glance  to  be  counterproductive,  significant 
gains  are  made  in  system  performance  by  ensuring  link  control  functions  are  conducted 
expeditiously.  With  different  priorities,  application  code  may  wait  unnecessarily  while 
requested  links  are  being  created  and  terminated  if  there  are  excessive  delays  along  the 
command  pipe  and  output  data  links. 
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PRI  PAR 
PAR 

manager  () 

agentO  () 

agent3  () 
userO  0 
user3  () 

Processor  configuration  code  also  includes  declaration  of  all  internal  channels 
used  between  the  processes.  Placement  of  channel  names  within  parameter  listings  for 
each  process  connects  the  correct  processes  together.  Also,  integer  constants  are  created 
in  the  configuration  parameters.  When  assigning  link  designations  to  link3  processes,  20 
is  added  to  the  slot.desig  value  received  by  the  TRAM  from  global  configuration  code. 
5.     Global  Configuration 

At  the  global  configuration  level,  all  external  links  are  assigned  to  processor 
code,  and  processor  code  assigned  to  processors.  As  mentioned  before,  the  TRAM  code 
is  replicated  to  each  T800  processor  on  the  BO  12.  Code  replication  is  performed  in  the 
global  configuration  section. 

Two  major  blocks  of  code  must  be  assigned  to  two  different  types  of  processors. 
Controller  code  is  assigned  to  the  crossbar  controlling  T212  Transputer,  and  TRAM  code 
is  assigned  to  one  or  more  T800s  on  the  B012  motherboard.  (Code  for  the  monitor  code 
on  the  B004  and  its  T414  is  handled  entirely  separate  from  the  message  exchange  code  in 
an  EXE  fold).  Transputer  type  is  signified  by  the  keyword  T2  for  T212  Transputer  and 
T8  for  a  T800.  This  lets  the  compiler  know  which  libraries  are  to  be  used  in  creating 
compiled  code  for  each  processor.  To  assign  links,  a  set  of  constants  is  defined  in  the 
library  links  which  also  contains  the  constant  no.trams.  Link  assignment  matches  the 
processor  code  parameter  with  the  hardware  link  engine  on  each  Transputer.  Controller 
code  header  shown  below  is  preceded  by  a  one-to-one  matching  of  software  channel 
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name  with  hardware  link  component  with  the  PLACE  ...  AT  construct.  The  channel 
names  are  then  matched  to  the  procedure  header  definition  in  the  local  Controller  code 
previously  defined. 


PLACED    PAR 

PROCESSOR  no. trains  T2 
PLACE  pipe[0] 
PLACE  pipe[no.trams+l] 
PLACE  t2.to.c0 
PLACE  t2.from.c0 
PLACE  t2.to.cl 
PLACE  t2.from.cl 


AT  lin]c2in 

AT  linklout 

AT  linlcOout 

AT  linkOin 

AT  linkSout 

AT  links in 


controller  (pipe[0],  pipe [no. trams+1] ^ 
t2.to.c0,  t2.from.c0, 
t2.to.cl,  t2.from.cl,  no. trams) 

Similarly,  the  TRAM  code  parameters  are  assigned  to  hardware  links,  but  a 

replication  process  is  used  to  assign  the  same  code  with  slightly  different  parameters  to 

each  T800.  This  is  done  by  using  an  array  of  channels  and  carefully  matching  the  correct 

array  index  with  the  replicated  parameters.  For  example,  to  create  the  pipe  each  linklout 

of  a  given  index  value  [slot]  is  physically  connected  to  the  next  corresponding  linklin  of 

index  value  [slot+1].  This  technique  informs  the  software  of  the  actual  connections  made 

with  the  B012  hardware  in  a  single  statement  instead  of  having  to  itemize  the  links  of 

each  processor.  Each  processor  is  also  given  a  designation  based  on  the  incremented 

value  of  slot.  Processor  identification  numbers  are  100,  200,  300, ...  for  the  T800  in  slots 

zero,  one,  two,  etc.  Since  slot  varies  for  each  replication,  each  T800  processor  receives  a 

different  constant  value  for  its  slot  designation  and  subsequent  link  designations. 


78 


PLACED  PAR  Slot  =  0  FOR  (no. trams) 

PROCESSOR  ( (slot+l)*100)  T8 
PLACE  pipe [slot] 
PLACE  pipG[slot+l] 
PLACE  link0.from.cl[slot] 
PLACE  Iink0.to.c0[slot] 
PLACE  Iink3.from.c0[slot] 
PLACE  link3.to.cl[slot] 


AT  linklout 
AT  link2in 
AT  linkOin 
AT  linkOout 
AT  linkSin 
AT  link3out 


tram,  code  (pipe  [slot],  pipe  [slot -i-1] , 

linkO . from.cl [slot] ,  linkO.to.cO [slot] , 
links. from.cO [slot] ,  linkS .to.cl [slot] , 
slot,  no. trams) 

As  can  be  seen,  all  channel  names  used  are  declared  strictly  for  the  global 

configuration  section,  particularly  those  which  are  declared  as  arrays  of  channels. 

Correspondence  with  local  channel  declarations  of  the  procedures  themselves  is  achieved 

by  correct  placement  of  the  parameters  listed  with  the  procedure  call. 
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V.    MESSAGE    EXCHANGE    PERFORMANCE    EVALUATION 

Numerous  test  runs  of  the  message  exchange  were  conducted  with  variations  in 
program  structure  and  parameters.  In  general,  each  run  consisted  of  ICXX)  link  cycles 
(link  control  activity  necessary  to  establish  and  terminate  one  link)  performed 
back-to-back  by  each  User.  With  four  TRAMs  in  the  system,  a  total  of  8000  link  cycles 
were  completed  in  each  run.  The  destination  is  chosen  at  random  for  each  link  cycle.  In 
the  unrestricted  case,  a  User  chooses  from  a  set  of  seven  possible  destinations  for  each 
link  cycle  (excluding  itself  from  the  complete  set  of  eight  Users).  When  performance  of 
only  like-link  or  only  dislike-link  connections  was  studied,  the  User  chose  from  a  set  of 
three  possible  destinations.  All  selections  were  uniformly  distributed  across  the  possible 
destinations  available.  Measurements  include  total  time  to  completion,  number  of 
requests,  acknowledges,  breaks,  aborts,  and  reinitializes  needed.  For  various  tests  the 
system  was  run  either  loaded  (with  work  simulated  in  the  User  process  by  adding  a 
10,000  iteration  SKIP)  or  unloaded  and  thus  only  performing  link  control  and  data 
passing  activity. 

In  the  tables  below  each  measurement  is  the  average  of  five  runs  under  the  same 
conditions.  Individual  runs  vary  due  to  the  random  nature  of  the  link  destination 
requests,  however,  variations  between  runs  are  minor.  Faults  are  introduced  by  providing 
only  one  physical  crossbar  to  crossbar  edge  connector  for  each  like-link  connection:  one 
cable  for  linkO  to  linkO  connections  and  one  cable  for  link3  to  link3  connections.  In  all 
fault-test  cases  the  library  tables  incomecdy  state  that  all  16  edge  connector  cables  are 
available,  therefore,  faults  are  detected  when  the  nonexistent  cables  are  used.  The  two 
cables  actually  connected  are  the  last  two  in  the  list,  so  the  previous  14  must  be  identified 
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as  faulty  before  the  good  cables  are  found.  Additional  faults  are  caused  as  known  failed 
cables  are  arbitrarily  reset  in  hopes  of  finding  repaired  cables.  Also,  additional  link 
control  commands  are  needed  to  accommodate  simultaneous  like-link  requests  since  only 
one  can  be  fulfilled  at  a  time  with  the  other  request  being  recycled. 

A.    DATA  TRANSFER  VARIATIONS  AND  SYSTEM  EFFICIENCY 
1.     Data  Block  Size 

Significant  variations  in  system  performance  occurs  due  to  data  block  size.  Data 
block  size  must  be  defined  at  compile  time  due  to  the  requirements  of  the  link  failure  and 
recovery  library  procedures.  Once  selected,  the  data  block  size  remains  fixed  for  the 
duration  of  program  execution.  Therefore,  if  the  application  program  expects  to  transmit 
variable  length  data  between  nodes,  the  block  size  must  be  chosen  to  allow  the  maximum 
expected  data  length,  with  extra  bytes  in  shorter  messages  being  discarded.  Test 
measurements  were  collected  with  data  block  size  varying  from  two  to  8192  bytes  in 
factors  of  two.  The  upper  limit  of  8192  bytes  was  reached  due  to  the  memory  available 
on  each  TRAM  (36  kilobytes  total  of  on  and  off-chip  RAM)  and  the  declaration  of  arrays 
for  both  output  and  input  data  blocks  by  each  user. 

Tables  5.1  and  5.2  provide  the  results  of  testing  the  unloaded  system  with  and 
without  faults,  and  Tables  5.3  and  5.4  with  simulated  system  load.  Time  to  complete  an 
8000  request  cycle  run  increases  with  data  block  size,  but  not  significantiy  until  block 
size  reaches  64  bytes  in  the  unloaded  runs  and  128  bytes  loaded.  The  relatively 
consistent  execution  times  for  smaller  data  block  sizes  represents  the  time  to  route  link 
control  information  and  perform  the  action  required  for  link  operations.  Actual  process- 
ing and  data  passing  performed  by  Users  easily  fit  within  the  gaps  formed  by  high 
priority  Agent  and  Manager  processes  concentrating  on  link  control. 
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Raw  data  measured  and  listed  below  includes  execution  time,  number  of 
requests  and  total  link  control  commands  used  as  averaged  over  five  runs  of  identical 
parameters.  Calculations  which  compare  number  of  data  bytes  take  into  account  the  data 
block  size  and  the  number  of  data  blocks  passed:  8000  in  all  runs.  Calculations  involving 
link  control  bytes  take  into  account  that  each  command  consists  of  three  bytes  and  the 
number  of  link  control  commands  issued  (either  requests  only  or  the  sum  of  all  six 
command  types,  as  specified).  Calculated  figures  listed  include  microsecond  per  data 
byte  passed,  average  number  of  requests  needed  for  the  per  data  byte  passed,  data  transfer 
rate  in  kilobytes  per  second  (inverse  of  microseconds  per  data  byte,  adjusted  for  change 
in  units),  a  ratio  of  data  bytes  to  control  bytes,  and  control  communications  overhead  as  a 
percentage  of  link  control  bytes  passed  of  the  total  bytes  passed  in  the  system. 

The  single  T212  link  controller  also  represents  a  limiting  factor,  particularly 
when  processing  a  request  for  a  like-link.  During  the  T212's  execution  of  sequential  link 
manipulation  code  no  additional  link  control  commands  are  received  causing  the  pipe  to 
fill.  As  each  TRAM's  buffer  fills,  no  more  control  commands  can  be  submitted  or 
received  by  the  Managers  on  subsequent  TRAMs.  This  ripple  effect  continues  and 
prevents  Agents  and  Users  from  proceeding  to  the  next  data  transmission.  Although  it 
may  be  possible  to  implement  the  T212  code  as  a  collection  of  parallel  processes  with 
one  being  an  input  buffer,  such  constructs  consume  considerable  amounts  of  memory. 
With  only  two  kilobytes  of  on-chip  RAM  and  no  off-chip  RAM  available,  every  effort 
was  made  to  minimize  T212  code  size. 
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TABLE  5.1 
SYSTEM  PERFORMANCE  DATA:  UNLOADED,  NO  FAULTS 


Data 

Time 

Number 

Total 

uSec 

REQs 

Data 

Data 

Comms 

Block 

(Sec) 

Of 

Link 

Per 

Per 

Xfer 

To 

Over- 

Size 

REQs 

Control 

Data 

Data 

Rate 

Control 

head 

Cmds 

Byte 

Byte 

KB/Sec 

Ratio 

T 
^ 

3.56 

24,266 

48,266 

222.65 

1.517 

4.4 

0.33 

90.05% 

4 

3.56 

24,287 

48,287 

111.24 

0.759 

8.8 

0.66 

81.91% 

8 

3.56 

24,269 

48,265 

55.63 

0.379 

17.6 

1.33 

69.35% 

16 

3.63 

24,788 

48,788 

28.32 

0.194 

34.5 

2.62 

53.35% 

32 

3.68 

25,329 

49,329 

14.36 

0.099 

68.0 

5.19 

36.63% 

64 

3.74 

26,322 

50,322 

7.31 

0.051 

133.6 

10.17 

22.77% 

128 

3.94 

28,468 

52,468 

3.85 

0.028 

253.6 

19.52 

13.32% 

256 

4.29 

32,286 

56,286 

2.10 

0.016 

466.2 

36.39 

7.62% 

512 

5.09 

38,330 

63,958 

1.24 

0.009 

786.3 

64.04 

4.47% 

1024 

7.29 

58,299 

82,299 

0.89 

0.007 

1097.1 

99.54 

2.93% 

2048 

11.90 

91,136 

114,736 

0.73 

0.006 

1344.1 

142.80 

2.06% 

4096 

21.41 

157,309 

181,309 

0.65 

0.005 

1494.5 

180.73 

1.63% 

8192 

40.44 

289,091 

313,094 

0.62 

0.004 

1582.8 

209.32 

1.41% 

TABLE  5.2 
SYSTEM  PERFORMANCE  DATA:  UNLOADED,  WFTH  FAULTS 


Data 

Time 

Number 

Total 

uSec 

REQs 

Data 

Data 

Comms 

Block 

(Sec) 

Of 

Link 

Per 

Per 

Xfer 

To 

Over- 

Size 

REQs 

Control 

Data 

Data 

Rate 

Control 

head 

Cmds 

Byte 

Byte 

KB/Sec 

Ratio 

2 

3.75 

24,993 

49,044 

234.21 

1.562 

4.2 

0.33 

90.19% 

4 

3.73 

25,107 

49,154 

116.66 

0.785 

8.4 

0.65 

82.17% 

8 

3.73 

24,872 

49,121 

58.31 

0.389 

16.7 

1.30 

69.72% 

16 

3.80 

25,583 

49,631 

29.70 

0.200 

32.9 

2.58 

53.77% 

32 

3.86 

26,258 

50,306 

15.09 

0.103 

64.7 

5.09 

37.09% 

64 

3.94 

27,355 

51,437 

7.70 

0.053 

126.8 

9.95 

23.16% 

128 

4.14 

29,247 

53,441 

4.04 

0.029 

241.4 

19.16 

13.54% 

256 

4.52 

33,452 

57,511 

2.01 

0.016 

442.1 

35.61 

7.77% 

512 

5.33 

41,725 

65,786 

1.30 

0.010 

750.4 

62.26 

4.60% 

1024 

7.62 

60,629 

84,699 

0.93 

0.007 

1050.3 

96.72 

3.01% 

2048 

12.06 

93,171 

117,245 

0.74 

0.006 

1327.2 

139.74 

2.10% 

4096 

22.08 

166,511 

190,620 

0.67 

0.005 

1449.1 

171.90 

1.72% 

8192 

42.01 

312,722 

334,880 

0.64 

0.005 

1523.4 

195.70 

1.51% 
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TABLE  53 
SYSTEM  PERFORMANCE  DATA:  LOADED,  NO  FAULTS 


Data 

Time 

Number 

Total 

uSec 

REQs 

Data 

Data 

Comms 

Block 

(Sec) 

Of 

Link 

Per 

Per 

Xfer 

To 

Over- 

Size 

REQs 

Control 

Data 

Data 

Rate 

Control 

head 

Cmds 

Byte 

Byte 

KB/Scc 
1.1 

Ratio 

2 

14.36 

8,941 

32,941 

897.35 

0.559 

0.49 

86.07% 

4 

14.36 

8,849 

32,849 

448.66 

0.277 

2.2 

0.97 

75.49% 

8 

14.37 

8,905 

32,905 

224.54 

0.139 

4.3 

1.95 

60.67% 

16 

14.41 

9,156 

33,156 

112.54 

0.072 

8.7 

3.86 

43.73% 

32 

14.45 

9,116 

33,116 

56.43 

0.036 

17.3 

7.73 

27.96% 

64 

14.55 

9,304 

33,304 

28.41 

0.018 

34.4 

15.37 

16.33% 

128 

14.74 

9,647 

33,647 

14.40 

0.009 

67.8 

30.43 

8.97% 

256 

15.16 

10,604 

34,604 

7.40 

0.005 

131.9 

59.18 

4.82% 

512 

16.09 

14,556 

38,556 

3.93 

0.004 

248.6 

106.24 

2.75% 

1024 

17.69 

17,180 

41,580 

2.16 

0.002 

452.1 

197.02 

1.50% 

2048 

21.70 

38,210 

62,210 

1.32 

0.002 

737.3 

263.37 

1.13% 

4096 

30.07 

85,155 

109,155 

0.92 

0.003 

1064.1 

300.20 

0.99% 

8192 

48.01 

201,155 

225,155 

0.73 

0.003 

1333.1 

291.07 

1.02% 

TABLE  5.4 
SYSTEM  PERFORMANCE  DATA:  LOADED,  WITH  FAULTS 


Data 

Time 

Number 

Total 

uSec 

REQs 

Data 

Data 

Comms 

Block 

(Sec) 

Of 

Link 

Per 

Per 

Xfer 

To 

Over- 

Size 

REQs 

Control 

Data 

Data 

Rate 

Control 

head 

Cmds 

Byte 

Byte 

KB/Sec 

Ratio 

2 

14.38 

9,236 

33,303 

898.88 

0.577 

1.1 

0.48 

86.20% 

4 

14.39 

9,201 

33,266 

449.58 

0.288 

2.2 

0.96 

75.72% 

8 

14.40 

9,180 

33,249 

224.93 

0.143 

4.3 

1.92 

60.92% 

16 

14.43 

9,239 

33,302 

112.74 

0.072 

8.7 

3.84 

43.84% 

32 

14.48 

9,439 

33,507 

56.54 

0.037 

17.3 

7.64 

28.20% 

64 

14.57 

9,510 

33,577 

28.45 

0.019 

34.3 

15.25 

16.44% 

128 

14.77 

9,926 

34,001 

14.4 

0.010 

67.7 

30.12 

9.06% 

256 

15.20 

11,122 

35,202 

7.42 

0.005 

131.6 

58.18 

4.90% 

512 

16.14 

15,246 

39,337 

3.94 

0.004 

247.8 

104.13 

2.80% 

1024 

17.83 

19,762 

43,254 

2.18 

0.002 

448.8 

189.39 

1.56% 

2048 

21.81 

39,840 

63,338 

1.33 

0.002 

733.6 

258.68 

1.15% 

4096 

30.44 

91,405 

115,533 

0.93 

0.003 

1051.3 

283.62 

1.05% 

8192 

49.29 

221,866 

246,043 

0.75 

0.003 

1298.4 

266.36 

1.11% 

84 


Also  of  concern  is  the  B004  and  T414  monitor  Transputer.  Excessive  work  by 
this  Transputer  would  also  have  a  slowing  influence  on  the  link  control  pipe.  Several 
variations  of  B004  code  were  used  with  widely  varying  degrees  of  complexity.  Timing 
differences  between  the  simplest  T414  code  and  the  monitor  program  used  were 
negligible.  Also,  both  input  and  output  buffers  were  added  as  part  of  the  T414  code  to 
help  smooth  its  presence  on  the  pipe. 

Figure  5.1  shows  the  effect  of  data  block  size  on  execution  time.  All  four 
combinations  of  load  and  fault  status  are  shown.  Note  the  minimal  difference  between 
the  runs  with  and  without  faults.  With  any  number  of  TRAMs  in  the  system,  the 
minimum  number  of  good  edge  connectors  needed  is  two;  one  for  each  type  of  like-link. 
Less  than  one  link  would  result  in  failure  to  pass  any  data  between  like-links,  thus 
causing  affected  processes  to  hang.  However,  with  four  TRAMs  the  minimum  needed  to 
obtain  fault  free  transmission  is  four,  two  per  like-link  type.  Therefore,  the  differences  in 
the  number  of  edge  connectors  in  these  runs  is  only  a  factor  of  two,  and  does  not 
significantly  stress  the  system  since  the  simultaneous  need  for  two  like-link  connections 
is  relatively  rare. 

As  data  block  size  increases  beyond  128  bytes  the  control  pipe  and  T212  are  no 
longer  bottlenecks.  Sufficient  time  is  spent  by  each  node  in  creating  and  passing  data  that 
link  control  commands  enter  the  pipe  with  minimal  restriction.  Consequently,  links  are 
terminated  in  a  timely  manner  allowing  the  next  request  for  those  resources  to  be  fulfilled 
in  the  first  attempt  without  recycling  the  link  request.  Worst  case  performance  occurs 
with  the  smallest  data  block  size  and  unloaded  User  processes.  Since  no  application 
work  is  done  by  unloaded  Users  the  performance  data  is  used  for  comparison  as  a 
baseline. 
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Figure  5.1.  Execution  Time  Versus  Data  Block  Size 

Tables  5.1  through  5.4  show  increases  in  data  communication  rate  with  block 
size,  but  the  rate  does  not  double  as  block  size.  Although  more  data  is  transferred  in 
larger  blocks,  the  system  must  work  harder  to  establish  the  requested  connections  since 
resources  are  held  longer.  Without  a  quick  turnover  of  resources  to  the  next  requesting 
node,  link  control  information  is  recycled  back  into  the  pipe  for  a  later  try.  The  number 
of  link  requests  increases  dramatically  with  block  size  showing  the  difficulty  in  achieving 
the  requested  connection.  Therefore,  greater  link  control  overhead  reduces  system 
efficiency  when  block  size  becomes  too  large. 
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One  way  to  show  this  is  to  compare  data  bytes  versus  control  bytes,  noting  that 
each  control  command  consists  of  three  bytes.  Figure  5.2  shows  the  data  byte-to-control 
byte  ratio  and  the  loss  of  communication  efficiency  encountered  when  data  block  size 
becomes  very  large.  In  the  unloaded  tests,  as  the  block  size  is  increased  beyond  1024 
bytes  the  data-to-control  ratio  (and  hence,  system  efficiency)  still  increases  but  less 
rapidly  than  with  block  sizes  below  1024  bytes.  As  data  block  size  is  increased  beyond 
4096  bytes,  the  data-to-control  ratio  actually  decreases.  TTie  loaded  case  shows  a  similar 
affect,  however,  since  fewer  control  communications  are  needed  the  decrease  in 
efficiency  will  take  place  at  higher  data  block  sizes  than  practical  to  test. 
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Figure  5.2.  Data-To-Control  Communication  Ratio 


87 


Efficiency  comparisons  using  control  and  data  communications  are  not  the  best 
indicator  of  performance  since  each  link's  DMA  engine  proceeds  with  only  minimal 
effort  of  the  system  CPUs.  Although  the  use  of  DMAs  is  especially  beneficial  in  passing 
large  data  blocks,  system  burden  increases  when  all  CPUs  in  the  control  pipe  are  passing 
small  commands.  Using  the  execution  time  of  the  unloaded  runs  with  small  data  block 
size  as  a  base  time  gives  a  better  comparison  of  system  overhead  resulting  from  link 
control  operations.  As  discussed  above,  no  significant  change  in  execution  time  occurs 
until  block  size  exceed  64  bytes.  Therefore,  execution  time  of  small  block  size  runs  is 
predominantly  a  measure  of  link  control  effort  of  the  system.  By  averaging  the  execution 
times  for  two,  four,  and  eight  byte  block  sizes  of  the  unloaded  runs  a  base  execution  time 
for  the  system  is  established. 

Figure  5.3  shows  a  ratio  of  base  execution  time  versus  execution  time  of  larger 
block  sizes  in  both  loaded  and  unloaded  runs.  Worst  case,  of  course,  is  small  block  size 
on  an  unloaded  system.  Control  overhead  drops  off  sharply  as  block  size  is  increased 
beyond  128  bytes.  In  the  loaded  system,  overhead  starts  at  about  25%  and  decreases 
slowly  as  block  size  increases  until  approaching  the  unloaded  overhead  at  a  about  8% 
with  a  8192  data  block  size.  Control  overhead  for  systems  with  load  less  than  the  test 
load  will  plot  between  the  test  load  and  the  unloaded  system  results.  Only  measurements 
of  runs  without  faults  are  shown  for  clarity. 

2.     Data  Routing  Via  Multiple  Crossbars 

Separating  each  TRAMs  data  links  into  individual  User  processes  allowed 
analysis  of  the  affects  of  different  routing  paths.  Due  to  the  wiring  arrangement  between 
the  crossbars  and  the  TRAMs,  the  methods  used  to  connect  like-links  is  significanUy 
more  complex  than  connecting  a  linkO  to  a  link3.  Additionally,  the  slowing  influence 
associated  with  multiple  C004  crossbars  is  a  concern  in  like-link  connections. 
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Figure  5.3.  System  Link  Control  Overhead 

Three  sets  of  runs  were  conducted  under  varying  constraints  limiting  the  routing 
assignments.  For  each  of  three  data  block  sizes:  8192,  1024,  and  128  bytes,  a  set  of  runs 
were  conducted  under  normal  routing  (uniformly  selected  from  all  seven  possible 
destinations),  with  only  like-link  routing  and  with  only  dislike-link  routing  (each 
uniformly  selected  from  the  appropriate  subset  consisting  of  three  possible  destinations). 
Table  5.5  below  shows  the  results. 

As  can  be  seen  from  the  data,  the  large  data  block  size  of  8192  blocks  causes  an 
increase  in  required  time  when  routing  is  restricted  to  only  like-link  (two  connections  per 
crossbar)  as  compared  to  the  case  in  which  routing  is  restricted  to  only  dislike-links  (one 
connection  per  crossbar).  This  result  is  not  as  pronounced  in  the  1024  byte  block  and 


89 


even  less  with  128  byte  block.  Figure  5.4  shows  a  comparison  of  the  three  link 
restrictions  and  three  data  block  sizes,  normalized  to  the  random  link  type  of  each  block 
size. 

TABLE  5^ 
LINK  ROUTING  ANALYSIS  RESULTS 


Parameter  Measured 

Link  Routing  Restriction 

MESSAGE  SIZE:  8192 

Oto3 
3  too 

RANDOM 

OtoO 
3  to  3 

Execution  Time  (seconds) 
Number  of  Requests 
Total  Control  Commands 
Time(usec)  /  Data  Byte 
Requests /Data  Byte 
Kilobytes  Data  /  Second 

45.706 
160,311 
184,311 
0.6974 
0.0024 
1400.3 

48.137 

203,743 

227,743 

0.7345 

0.0031 

1329.5 

49.023 

216,073 

239,473 

0.7480 

0.0033 

1305.5 

MESSAGE  SIZE:  1024 

Oto3 
3  too 

RANDOM 

OtoO 
3  to  3 

Execution  Time  (seconds) 
Number  of  Requests 
Total  Control  Commands 
Time(usec)  /  Data  Byte 
Requests /Data  Byte 
Kilobytes  Data  /  Second 

17.682 
16,625 
32,625 
2.1584 
0.0020 
452.4 

17.7562 
18,558 
34,558 
2.1675 
0.0023 
450.5 

17.846 
17,571 
33,571 
2.1785 
0.0021 
448.3 

MESSAGE  SIZE:  128 

Oto3 
3toO 

RANDOM 

OtoO 
3  to  3 

Execution  Time  (seconds) 
Number  of  Requests 
Total  Control  Commands 
Time(usec)  /  Data  Byte 
Requests  /  Data  Byte 
Kilobytes  Data  /  Second 

14.7788 
9,427 
25,389 
14.4324 
0.0092 
67.7 

14.7442 
9,617 
25,617 
14.3986 
0.0094 
67.8 

14.8594 
9,427 
25,427 
14.5111 
0.0092 
67.3 
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03-03         RAND         00-33 
—  1 28  Byte  Data  Block    — 


03-03         RAND         00-33 
— 1 024  Byle  Data  Block    — 


03-03         RAND         00-33 
— 8192  Byte  Data  Block     - 


Link  Routing  Type  And  Data  Block  Size 
■■  Execution  Time  ^S  Number  Of  Requests 

^  No.  Link  Commands  [^  KBytes{Data)/Sec 


Figure  5.4.  Performance  With  Communication  Type  Restrictions 

B.    TRAM  CODE  STRUCTURE 

Considerable  influence  upon  performance  can  result  from  slight  changes  in  an 
Occam  program  structure.  Timing  and  CPU  utilization  are  changed  when  different 
priorities  are  assigned  to  parallel  (PAR)  processes.  The  TRAM  code  is  subject  to  these 
considerations  and  various  configurations  and  their  effects  are  discussed  below. 

As  shown  earlier,  each  TRAM  contains  five  processes  executing  in  parallel  and 
communicating  with  each  other  and  with  other  processors  (see  Figure  3.7).  Atkin 
[Ref  10:p.  121  stressed  that  efficient  code  must  be  structured  to  separate  as  much  as 
practical  those  processes  performing  communications  from  those  performing  calculation. 
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This  is  accomplished  in  the  TRAM  by  placing  all  application  calculation  in  only  the  User 
modules.  When  data  is  to  be  passed  to  an  external  destination  the  data  is  passed  to  a  high 
priority  process  dedicated  to  outputting  that  block  along  a  link. 

Communications  processes  should  be  run  in  high  priority  to  help  ensure  that  the 
communications  themselves  do  not  become  a  bottieneck  in  holding  up  information  flow. 
I*rompt  initiation  of  communications  also  sets  in  action  the  DMA  link  engines  which  can 
then  release  the  processor  to  perform  other  work.  Given  this  guidance,  it  is  clear  the 
Manager  process  must  be  run  at  high  priority  to  handle  all  link  control  communications 
and  the  User  processes  must  be  run  at  low  priority  to  perform  application  work  and  data 
input.  Different  configurations  were  tested  adjusting  these  priorities.  Performance  of  the 
configurations  varied  widely;  some  resulted  in  nonfunctioning  programs  due  to  commu- 
nications starvation.  For  example,  running  the  User  processes  in  high  priority  and  the 
Agent  processes  at  low  priority  resulted  in  deadlock. 

Three  configurations  were  tested  and  compared  under  unloaded  and  loaded  condi- 
tions using  a  1024  data  block  size.  The  normal  case  consisted  of  the  process  priority  as 
listed  in  the  presented  code.  Case  I  and  Case  II  are  listed  below: 


Normal : 

PRI  PAR 

PAR  —  hi 
manager  () 
agentO  () 
agents  () 

--low 

userO  0 

user3  () 


Case  I: 

PRI  PAR 

manager  ()  —  hi 
agentO  ()   —  low 
agents  () 
userO  () 
user3  () 


Case  II: 

PAR  —  low 
manager  () 
agentO  () 
agents  () 
userO  0 
users  0 


Comparisons  of  the  three  cases  are  shown  in  Figure  5^.  Separate  tests  were 
performed  with  and  without  simulated  load  in  the  User  process.  All  figures  are 
percentages  normalized  against  the  appropriately  loaded  normal  case. 
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Figure  5.5.  TRAM  Code  Process  Prioritization  And  User  Load 

C.    LINK  FAULT  RECOVERY  PERFORMANCE 

When  a  link  fault  occurs  a  minimum  of  three  additional  control  commands  are 
issued:  link.abt,  link.rei  and  link.ack  (see  Figure  4.1).  If  a  replacement  route  is  not 
available  either  due  to  lack  of  "good"  edge  connectors  or  all  routes  are  busy,  a  link.req 
will  cycle  in  the  control  pipe  as  in  the  non-fault  case.  If  other  faulty  edge  connectors  are 
inadvertently  assigned  (having  not  yet  been  discovered  as  faulty),  the  abort  message 
chain  will  repeat  until  a  good  route  is  found  or  a  cycling  request  is  issued. 

After  the  Controller  has  determined  a  valid  status  for  all  16  edge  connectors  the 
abort  message  chain  should  occur  once  for  each  new  request  resulting  in  an  abort  due  to  a 
newly  created  fault.  However,  as  the  req.cnt  increments  when  the  Controller  realizes  a 
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resource  shortage,  previously  identified  failed  edge  connectors  are  reset  allowing  them  to 
be  reassigned  in  the  hopes  of  locating  repaired  cables.  Two  variables  control  the 
periodicity  of  resetting  edge  connector  status:  req.cnt  upper  limit  and  delay  time  value. 
For  this  project  the  req.cnt  upper  limit  was  set  equal  to  the  number  of  TRAMs  in  the 
system  (four)  and  the  value  for  the  delay  timer  was  set  to  16000  ticks  of  the  low  priority 
clock,  or  about  one  second.  Use  of  the  counter  prevents  unnecessarily  resetting  edge 
status  without  need,  and  the  timer  minimizes  repeated  resettings  when  the  need  for  new 
resources  is  great.  Both  values  can  be  adjusted  accordingly  to  adapt  system  performance 
to  the  application  implemented.  For  example,  a  shorter  time  delay  would  quicken  the 
Controller's  ability  to  find  newly  repaired  edge  connectors. 

Execution  time  affects  fault  recovery  actions  since  as  the  system  operates  longer 
under  fault  conditions,  more  previously  identified  faulty  edge  connectors  will  be  reset 
based  on  delay  time.  Data  block  size  does  not  directly  affect  the  degree  of  additional 
workload  resulting  from  the  presence  of  link  faults.  However,  in  the  test  cases  run  the 
use  of  larger  data  blocks  increases  the  total  amount  of  data  transferred,  thus  increasing 
total  execution  time  and,  consequently,  the  number  of  fault  recovery  communications. 
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TABLE  5.6 
PERFORMANCE  ASPECTS  OF  FAULT  DETECTION  AND  RECOVERY 


Data 

Block 

Size 

System  Unloaded 

System  Loaded 

No.  Of 

Aborts 

Added 

Total 

Comms 

Comms 

Per 
Abort 

No.  Of 
Aborts 

Added 

Total 

Comms 

Comms 

Per 

Abort 

2 

4 

8 

16 

32 

64 

128 

256 

512 

1024 

2048 

4096 

8192 

17.2 
15.4 
16.6 
15.8 
16.0 
18.2 
19.4 
19.4 
20.4 
23.2 
24.8 
36.2 
52.6 

778 

867 

856 

843 

977 

1,115 

974 

1,224 

1,828 

2,400 

2,509 

9,310 

21,786 

45.2 

56.3 

51.6 

53.4 

61.1 

61.2 

50.2 

63.1 

89.6 

103.4 

101.2 

257.2 

414.2 

22.2 
21.6 
22.8 
22.0 
22.6 
22.6 
25.0 
26.6 
29.8 
30.8 
32.6 
42.6 
59.0 

362 

417 

344 

146 

391 

273 

354 

598 

781 

1,674 

1,128 

6,378 

20,888 

16.3 
19.3 
15.1 
6.6 

17.3 

12.1 

14.1 

22.5 

26.2 

54.4 

34.6 

149.7 

354.0 
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VI.    CONCLUSIONS  AND  RECOMMENDATIONS 

A.    CONCLUSIONS 

By  using  specialized  though  off-the-shelf  hardware,  a  dynamically  reconfigurable 
network  of  Transputers  can  be  constructed.  The  Transputer's  unique  link  hardware 
makes  this  single-chip  processor  very  appropriate  for  this  type  of  system.  Efficiency  of 
the  message  exchange  presented  here  is  largely  determined  by  data  block  size  and  the 
application  load  on  each  node.  System  control  overhead  is  minimized  by  keeping  the 
link  control  commands  small  and  infrequent.  As  a  dynamically  reconfigurable  system, 
this  message  exchange  evaluated  the  worst  case  scenario  of  requesting  and  establishing  a 
new  connection  for  each  data  block  passed.  A  semi-static  configuration  system,  which 
reconfigures  the  network  topology  only  at  synchronized  points  during  application 
program  execution,  would  improve  system  efficiency  by  reducing  overhead  at  the 
expense  of  reconfiguration  flexibility. 

A  concept  implemented  in  hardware  should  have  superior  performance  over  the 
same  concept  implemented  in  software.  In  this  case,  use  of  program  controlled  crossbars 
to  directly  connect  communicating  nodes  should  perform  comparably  well  or  better  than 
a  packet  routing  system  which  passes  data  through  intermediaries.  However,  gains  made 
in  the  direct  routing  of  data  are  offset  by  losses  incurred  in  routing  crossbar  control 
information.  Utilization  of  a  link  control  pipe  to  pass  control  information  from  node  to 
node  diminishes  some  of  the  gains  achieved  from  direct  data  connection.  An  ideal 
correction  to  this  drawback  would  be  to  pass  link  control  information  point-to-point, 
comparable  to  the  manner  in  which  data  is  passed.  One  method  for  implementing 
complete  crossbar  connectivity  is  described  in  the  Esprit  F*roject  [Ref  24]  in  which  all 
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four  links  of  each  Transputer  are  connected  to  large  crossbars.  Thus  far  in  the  Esprit 
Project  semi- static  reconfiguration  is  implemented,  and  use  of  specialized  hardware 
allows  for  increased  expandability. 

Techniques  explored  in  this  paper  stress  the  circuit  switching  approach  made 
possible  with  readily  available  hardware,  including  program  controlled  crossbar  switches. 
Considerable  commercially  available  hardware  also  exists  employing  packet  routing 
methods,  most  notably  hypercube  parallel  processors.  Hypercube  topologies  maximize 
performance  by  symmetrically  interconnecting  nodes  to  minimize  the  distance  between 
any  pair  of  processors,  thus  reducing  packet  routing  overhead.  A  hypercube  topology 
contrasts  the  message  exchange  developed  here  by  substituting  crossbar  control  overhead 
with  packet  routing  overhead.  A  key  parameter  affecting  performance  in  both  topologies 
is  message  size,  which  is  determined  entirely  by  the  application  to  be  executed.  A 
performance  comparison  between  the  two  methods  should  be  conducted  by  executing 
identical  application  code  in  each  topology.  Since  the  number  of  nodes  in  a  hypercube  is 
equal  to  two  raised  to  the  nth  power,  where  n  is  the  number  of  links  available  to  each 
node,  a  16  node  hypercube  could  be  readily  constructed  using  Transputers.  This 
Transputer  hypercube  can  be  evaluated  against  a  16  node  message  exchange  implement- 
ed on  a  fully  populated  BO  12  motherboard. 

B.    RECOMMENDATIONS  FOR  FOLLOW-ON  WORK 

Integrated  shipboard  weapons  systems  of  the  complexity  found  in  AEGIS  consist  of 
a  collection  of  specialized  computers  distributed  throughout  the  weapon  platform  and 
communicating  with  each  other  as  necessary  to  detect,  identify,  track,  display  and  direct 
the  weapons'  fire  towards  targets.  Although  the  processing  power  is  distributed  among  a 
number  of  processors,  the  specialization  calls  for  discrete  elements  of  processing  power 
to  be  assigned  to  specific  tasks.  Communications  within  each  group  as  well  as  between 
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groups  need  not  be  completely  dynamic  nor  be  able  to  achieve  any  arbitrary  topology. 
However,  some  reconfiguration  ability  would  be  invaluable  as  the  system  posture 
changes  or  in  response  to  faults  occurring  in  the  system.  A  posture  change  of  the 
weapons  system  could  result  from  refinement  of  the  tactical  environment,  for  example,  as 
the  potential  source  of  the  current  threat  is  narrowed  to  subsurface  as  opposed  to 
airborne,  appropriate  computing  resources  can  be  directed  to  concentrate  accordingly. 

Given  this  dedicated  grouping  of  computing  resources,  an  architecture  similar  to  the 
BO  12  can  be  employed  at  each  specialized  station.  A  collection  of  Transputers  with 
dynamic  reconfiguration  could  adapt  to  system  load  changes  as  well  as  the  occurrence  of 
local  faults.  Interconnection  between  stations  can  be  achieved  by  using  switchable  links 
of  the  crossbars  accessible  at  the  edge  connectors.  A  hierachy  of  control  would  be 
established  between  stations  using  a  inter-B012  link  control  pipe  separate  from  the 
internal  pipe  developed  here.  Again  using  the  B012  as  an  example,  the  TRAM  in  slot 
zero  can  readily  be  disconnected  from  normal  C004  crossbar  connections,  thus  freeing  a 
significant  processing  resource  to  direct  interstation  communication  control. 

Figure  6.1  shows  an  arrangement  of  BO  12  stations  with  an  additional  link  control 
pipe  using  the  slotO  TRAM  of  each  station.  Data  connections  between  stations  can  be 
hardwired  between  the  remaining  crossbar  links  not  used  for  the  local  TRAMs. 
Assignment  of  these  external  links  can  be  weighted  in  favor  of  the  most  needed 
communications  routes,  with  additional  links  assigned  for  redundant  communications 
paths  thus  allowing  for  interstation  link  fault  tolerance  in  a  similar  manner  as  demonstrat- 
ed on  the  single  B012  in  this  project.  Interstation  link  fault  tolerance  would  become 
especially  valuable  when  considering  various  battle  damage  possibilities  throughout  the 
ship  as  well  as  equipment  failures. 
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Figure  6.1.  Multiple  B012  Connectivity  And  Control 

Link  control  communications  overhead  will  always  be  a  factor  limiting  efficiency, 
however,  if  reconfiguration  is  a  rare  event  then  control  signals  will  be  minimized  as 
compared  to  a  completely  dynamic  and  therefore  worst  case  system  in  terms  of  overhead. 
Also,  if  the  processing  power  of  a  given  station  must  be  increased,  or  if  more  external 
links  must  be  added,  two  or  more  BO  12s  can  be  arranged  head-to- tail  to  extend  the  local 
link  control  pipe  thus  increasing  the  number  of  local  TRAMS  as  well  as  crossbars. 
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Although  Transputers  are  an  extremely  capable  computer  component,  the  above 
discussion  should  not  be  taken  to  imply  that  a  complex  integrated  weapons  system  can  or 
should  be  implemented  in  the  off-the-shelf  technology  as  described  in  this  thesis. 
However,  simulations  and  modelling  of  system  functions  and  interactions  can  be 
implemented  as  described.  Link  connections  can  relay  simulated  sensor  information  as 
input  to  one  station,  environmental  and  real-world  simulation  to  another,  weapons 
response  to  a  third,  and  so  on.  Additional  stations  can  output  commands  to  weapons' 
control  as  well  as  tactical  displays  and  communications.  As  a  modelling  tool.  Transput- 
ers can  be  used  to  test  and  optimize  system  software  and  hardware  topologies,  and  to 
identify  and  test  the  specialized  hardware  necessary  to  meet  the  high  performance  needs 
of  a  modern  weapons  system. 
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APPENDIX    A 
MESSAGE  EXCHANGE  LIBRARIES  AND  SYSTEM  CODE 


—  DATA  LIBRARIES  used  in  B012  Message  Exchange 
--  All  libraries  stored  in  same  directory 


--  links  -  defines  Transputer  links  and  #  of  TRAMS  used 


VAL 

linkOout 

IS 

0 

VAL 

linklout 

IS 

1 

VAL 

link2out 

IS 

2 

VAL 

linkSout 

IS 

3 

VAL 

linkOin 

IS 

4 

VAL 

linklin 

IS 

5 

VAL 

link2in 

IS 

6 

VAL 

linkSin 

IS 

7 

VAL 

no . trams 

IS 

4 

--  standard  link  definitions 


—  number  of  TRAMS  on  B012 


--  c4cmds  -  defines  C004  crossbar  commands 


VAL 

C4 

, input . output 

IS 

0 

(BYTE) 

VAL 

c4 

.link 

IS 

1 

(BYTE) 

VAL 

C4 

. enquire 

IS 

2 

(BYTE) 

VAL 

c4 

setup 

IS 

3 

(BYTE) 

VAL 

c4 

. reset 

IS 

4 

(BYTE) 

VAL 

C4 

.disconn. 

output 

IS 

5 

(BYTE) 

VAL 

c4 

. disconn . 

link 

IS 

6 

(BYTE) 

--  ctrlcmds  -  defines  link  control  pipe  commands 


VAL 

link. 

req 

VAL 

link. 

ack 

VAL 

link 

rel 

VAL 

link 

brk 

VAL 

link 

abt 

VAL 

link 

rei 

IS 
IS 
IS 
IS 
IS 
IS 


40 
41 
42 
43 
44 
45 


(BYTE) 
(BYTE) 
(BYTE) 
(BYTE) 
(BYTE) 
(BYTE) 


--  link  actions 

for  control  pipe 


--  blcksize  -  defines  data  block  size  for  Agents  and  Users 
VAL  block. size    IS    INT16(16)  :    --  bytes 
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--  intcmds  -  defines  other  internal  commands  used 
-  and  link  translation  table 

--  predefined  values 

--  signal  to  reset 

--  output  is  done 

—  tram  done 

—  Agent  to  User 
--  Mgr  to  Agent 

—  Mgr  to  Agent 

—  nil  defined 

—  for  IF  stmts 

—  for  fault  recovery 

—  Mgr  to  User 

—  Mgr  to  User 

--  array  to  convert  C004  link  number  to  linkO  or  3  of  Tram 
slot 

--  index  0-31  for  c41inks  0-31 

—  data  INT  is   0-15  for  Trams  0-15  linkO 

20-35  for  Trams  0-15  linTc3 
VAL  [32]  INT  to. slot  IS 

[  22,  25,  21,   5,  2,   1,   6,  26, 

15,   8,  35,  31,  12,  32,  28,  11, 

7,  23,   4,   3,  27,  24,   0,  20, 

34,  13,  14,  33,  9,  29,  30,  10]  : 


VAL 

failed 

IS 

60 

(BYTE) 

VAL 

reinit 

IS 

61 

(BYTE) 

VAL 

done . with. link 

IS 

63 

(BYTE) 

VAL 

all .done 

IS 

65 

(BYTE) 

VAL 

ready .to. rev 

IS 

67 

(BYTE) 

VAL 

link .made 

IS 

68 

(BYTE) 

VAL 

link. gone 

IS 

69 

(BYTE) 

VAL 

byte . nil 

IS 

99 

(BYTE) 

VAL 

BOOL  otherwise 

IS 

TRUE: 

VAL 

INT  user. failed 

IS 

60 

VAL 

INT  user. reinit 

IS 

61 

VAL 

INT  user . linkmade 

IS 

68 

--  faultime  -  used  by  Agents  for  watchdog  timer 
VAL  fault. time    IS    INT (1024)  :    —  clock  ticks 


--  edgedata  -  defines  B012  PI  edge  connections  wired 

--  number  of  connections  suitable  for  0  to  0  links 
VAL  INT  edgeO.ct  IS  8  : 

--  number  of  connections  suitable  for  3  to  3  links 
VAL  INT  edge3.ct  IS  8  : 

--  each  edge.a[x]  is  manually  patched  to  edge.b[x] 
VAL  edge. a  IS  [  4  (BYTE) ,   6  (BYTE) ,  31  (BYTE) ,  25  (BYTE) , 

19  (BYTE),  16 (BYTE),   8 (BYTE) ,  12 (BYTE) , 
O(BYTE),   KBYTE),  29  (BYTE),  24  (BYTE)  , 

17  (BYTE),  21  (BYTE),  10 (BYTE)  ,  14  (BYTE)  ] 


VAL  edge.b  IS 


[  5 (BYTE) , 

22 (BYTE) , 

2 (BYTE) , 

23  (BYTE)  , 


3  (BYTE),  28  (BYTE),  26  (BYTE), 

18 (BYTE),  9 (BYTE),  15 (BYTE) , 

7  (BYTE),  30  (BYTE),  27  (BYTE)  , 

20  (BYTE),  13  (BYTE),  11  (BYTE)] 
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—  B012  Message  Exhcange  w/2  C004's  &  16 (max)  TRAMS 
--  Code  for  T212  Crossbar  Controller 

—  Note:  B004  monitoring  code  is  seperate 


PROC   controller  (CHAN  OF  ANY  from. pipe,  to. pipe, 

to.cO,  from.cO,  to.cl,  from.cl, 
VAL  INT  no. trams) 

--  data  library  calls 
#USE  "c4cmds.tsr" 
#USE  "ctrlcmds.tsr" 
#USE  "intcmds.tsr" 
#USE  "edgedata.tsr" 

VAL  delay  IS  16000  :  --  pause  in  status  reset 

BYTE  token,  source,  dest  : 

BYTE  src.conn,  dst.conn  :     --  holds  current  hookups 
BOOL  continue,  accomplished  : 

[16] BOOL  edge. ok  :  --  edge  status  array 

INT  i,  now,  edge. index,  req.cnt  : 

TIMER  clock  :  --  timer  for  edge  reset 


--  Subroutine  to  set  a  link  to  nil  if  inactive, 
or  reset  high  bit  to  low  if  active 


PROC  reset. or. nil  (BYTE  c41ink) 

IF 

--  if  high  bit  is  set  (link  is  acitve) 
( (INT (c41ink) )BITAND (INT (BYTE  #80) ))= (INT (BYTE  #80)) 

--  then  reset  high  bit  to  low  for  valid  link  number 
c41ink  :=  BYTE(  (INT(c41ink) )  ><  (INT  (BYTE  (#80) )) ) 

otherwise 

--  high  bit  is  low  therefore  link  is  NOT  active 
c41ink  :=  byte. nil 

:  —  end  of  PROC  reset. or. nil 
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--  Subroutine  to  handle  C004  commands  and  response 

PROC  c4.cmd  (CHAN  OF  ANY  to.x,  fm.x, 

VAL  BYTE  cmd,  bl,  b2,  BYTE  b3) 

SEQ 

IF  —  determine  command  to  use 

cmd  =  byte. nil         —  just  do  enquire 

SKIP 
bl  =  byte. nil  —  single  parameter  command 

to.x  !  cmd;  b2 
otherwise 

to.x  !  cmd;  bl;  b2    --  two  parameter  command 

to.x  !  c4. enquire;  b2    --  verify  action  by  enquire 
fm.x  ?  b3 
reset . or. nil (b3) 
:  --  end  of  c4.cmd 


--  Subroutine  to  get  current  connection  to 

requested  hookup.  All  i/o  in  c4  link  desigs 

PROC  get .current .tie  (CHAN  OF  ANY  to.cO,  from.cO, 

to.cl,  from.cl, 
VAL  BYTE  in. link,  BYTE  link . tied. to) 

--  Note:  all  Bytes  in  terms  of  C004  Link  designations 
SEQ 

IF  --  determine  if  in. link  is  already  connected 
in. link  =  byte. nil 

SKIP 
( (to. slot [INT(in. link) ] )  <  (16))   —  it  is  a  linkO 
--  get  input  link  (if  any)  connected  to  in. link 
c4.cmd(to.cl,  from.cl/  byte. nil,  byte. nil, 
in. link,  link. tied. to) 

(  (to. slot [INT(in. link) ] )  >  (19))   —  it  is  a  linkS 
--  get  input  link  (if  any)  connected  to  in. link 
c4 . cmd (to . cO,  from.cO,  byte. nil,  byte. nil, 
in. link,  link. tied. to) 

otherwise 
SKIP 
:   --  end  of  get . current .tie 
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--  Subroutine  to  make  a  connection  between  two  C004  links, 
veirfy  connection  is  made,  and  return  status  flag 

PROC  make. conn  (CHAN  OF  ANY  to . cO, f rom. cO, to . cl, f rom. cl, 

VAL  BYTE  source,  dest,  BOOL  conn. ok) 


--  SubSubroutine  to  handle  a  0-0  or  3-3  link 

PROC  make. 0033  (CHAN  OF  ANY  to. a,  fm.a,  to.b,  fm.b, 

VAL  BYTE  src,  dst, 

VAL  BOOL  link.OtoO,  BOOL  conn03.ok, 
BYTE  in. a,  in.b) 

BOOL  edge. used  : 

INT  index,  max. index  : 

BYTE  inputaa,  inputbb  : 

SEQ  --  select  appropriate  index  ranges  for  link 
IF 

link. OtoO 
SEQ 

index  :=  0 

max. index  :=  edgeO.ct 
otherwise 
SEQ 

index  :=  edgeO.ct 

max. index  :=  edgeO.ct  +  edge3.ct 

edge. used  :=  TRUE 

WHILE  (edge. used  AND  (index  <  max. index)) 
IF 

edge . ok [index]  --  edge  is  still  "good" 
SEQ 

c4.cmd(to.a,  fm.a,  byte. nil,  byte. nil, 

edge . a [index] ,  inputaa) 
c4.cmd(to.a,  fm.a,  byte. nil,  byte. nil, 
edge. b [index] ,  inputbb) 

IF 

((inputaa  =  byte. nil)  AND 
(inputbb  =  byte. nil)) 

SEQ  --  if  edge  not  currently  connected 
edge. used  :=  FALSE 
c4.cmd(to.a,  fm.a,  c4 . input . output, 

src,  edge. a [index] ,  in. a) 
c4.cmd(to.b,  fm.b,  c4 . input . output , 
edge.b [index] ,  dst,  inputbb) 
c4.cmd (to. a,  fm.a,  c4 . input . output. 
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dst,  edge. b [index] ,  in.b) 
c4.cmd(to.b,  fm.b,  c4 . input . output , 
edge. a [index] ,  src,  inputaa) 

IF 

((inputaa  =  edge. a [index] )  AND 
(inputbb  =  edge. b [index] ) ) 
conn03.ok  :=  TRUE 
otherwise 

conn03.ok  :=  FALSE 

otherwise 

index  :=  index  +1   --  busy,  try  next 

otherwise 

index  :=  index  +1  —  get  next  edge,  "bad" 

—  end  of  make. 0033 


--  Main  Section  for  PROC  make. conn 
BYTE  linka,  linkb,  inputa,  inputb  : 
BOOL  crosslink, ok  : 

SEQ 

crosslink. ok  :=  TRUE 

--  make  linka  linkO  of  two  to  be  connected  if  other 

is  link3.   If  both  linkO's  3's,  no  matter 
IF 

(  (to.slot [INT(source) ] )  <  (16)) 
SEQ 

linka  :=  source 
linkb  :=  dest 
otherwise 
SEQ 

linkb  :=  source 
linka  :=  dest 

IF 

((  (to.slot [INT(linka) ] )  <  (16))  AND 
(  (to.slot [INT(linkb) ] )  >  (19))) 
SEQ 

c4 . cmd (to . cO,  from.cO,  c4 . input . output, 

linka,  linkb,  inputa) 
c4 . cmd (to . cl,  from.cl,  c4 . input . output , 
linkb,  linka,  inputb) 

((  (to.slot [INT(linka) ] )  <  (16))  AND 
(  (to.slot [INT(linkb) ] )  <  (16))) 

make. 0033 (to. cO,  from.cO,  to.cl,  from.cl,  linka, 
linkb,  TRUE,  crosslink . ok,  inputa,  inputb) 
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(  {  (to.slot [INT(linka)  ] )  >  (19))  AND 
{  (to.slot [INT(linkb)  ] )  >  (19))) 
make. 0033  (to. cl,  from.cl,  to.cO,  from.cO,  linka, 

linkb,  FALSE,  crosslink . ok,  inputa,  inputb) 

otherwise 
SKIP 

--  verify  return  input  enq  bytes  match  (high  bit  set) 
IF   --  links  match,  connected  ok 

( ( (linka  =  inputa)  AND  (linkb  =  inputb) ) 
AND  crosslink. ok) 
conn. ok  :=  TRUE 
otherwise  --  links  are  not  made 
conn. ok  :=  FALSE 
■  end  of  make. conn 


--  Subroutine  to  brk  a  connection  between  two  Tram  Links 

(may  involve  as  many  as  4  C004  links) 
—   veirfy  connection  is  broken,  and  return  status  flag 

PROC  break. conn  (CHAN  OF  ANY  to.cO,  from.cO, 

to.cl,  from.cl, 
VAL  BYTE  source,  dest,  BOOL  broken. ok) 


--  SubSubroutine  to  handle  a  0-0  or  3-3  link 

PROC  break. 0033  (CHAN  OF  ANY  to. a,  fm.a,  to.b,  fm.b, 

VAL  BYTE  src,  dst, 
BYTE  in. a,  in.b,  BOOL  break03.ok) 

BYTE  edgea,  edgeb  : 

SEQ 

--  get  edges  used 

to. a  !  c4. enquire;  src 

fm.a  ?  edgea 

to. a  !  c4. enquire;  dst 

fm.a  ?  edgeb 

--  disconnect  the  edges 

c4.cmd(to.b,  fm.b,  c4 .disconn. output,  byte. nil, 

edgea,  in. a) 

c4.cmd(to.b,  fm.b,  c4 .disconn. output,  byte. nil, 

edgeb,  in.b) 

IF 

((in. a  =  byte. nil)  AND  (in.b  =  byte. nil)) 
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break03.ok  :=  TRUE 
otherwise 

break03.ok  :=  FALSE 

c4 . cmd (to . a,  fm.a,  c4 . disconn. output,  byte. nil, 

src,  in. a) 
c4 . cmd (to. a,  fm.a,  c4 .disconn. output,  byte. nil, 

dst,  in.b) 
--  end  of  break. 0033 


—  Main  Section  of  PROC  break. conn 
BYTE  linka,  linkb,  inputa,  inputb  : 
BOOL  edge. free  : 


edge. free  := 

=  TRUE 

IF 

( (to. Slot [INT (source) ] )  < 

(16)) 

SEQ 

linka 

:=  source 

linkb 

:=  dest 

otherwise 

SEQ 

linkb 

:=  source 

linka 

:=  dest 

IF 

--  break  a  tram  link  0  to  tram  link  3 
(( (to. slot [INT  (linka) ] )  <  (16))  AND 
( (to.slot [INT(linkb) ] )  >  (19))) 
SEQ 

c4 . cmd (to . cl,  from.cl,  c4 . disconn. output, 

byte. nil,  linka,  inputa) 
c4 . cmd (to. cO,  from.cO,  c4 . disconn. output, 
byte. nil,  linkb,  inputb) 

(( (to.slot [INT  (linka) ] )  <  (16))  AND 
( (to.slot [INT (linkb) ] )  <  (16))) 
break. 0033  (to.cl,  from.cl,  to.cO,  from.cO, 

linka,  linkb,  inputa,  inputb,  edge. free) 

(( (to.slot [INT(linka) ] )  >  (19))  AND 
( (to.slot [INT(linkb) ] )  >  (19))) 
break. 0033  (to.cO,  from.cO,  to.cl,  from.cl, 

linka,  linkb,  inputa,  inputb,  edge. free) 

otherwise 
SKIP 
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--  verify  broken  links  inactactive  (high  bit  off) 
IF 

--  links  verified  broken 

({(inputa  =  byte. nil)  AND  (inputb  =  byte. nil)) 
AND  edge. free) 
broken. ok  :=  TRUE 
otherwise 

--  links  not  broken 
broken. ok  :=  FALSE 
:  --  end  of  break. conn 


--  main  body  of  Controller  Procedure 

SEQ 

--  initialization 
to.cO  !  c4. reset 
to . cl  !  c4 . reset 
SEQ  i  =  0  FOR  16 

edge. ok [i]  :=  TRUE 
edge. index  :=  0 
clock  ?  now 
req.cnt  :=  0 

WHILE  TRUE 
ALT 

from. pipe  ?  token;  source;  dest 
SEQ 
IF 

token  =  link.req 
SEQ 

get . current .tie  (to.cO,  from.cO,  to.cl, 

from.cl,  source,  src.conn) 
get . current .tie  (to.cO,  from.cO,  to.cl, 
from.cl,  dest,  dst.conn) 

IF 

((src.conn  =  byte. nil)  AND 
(dst.conn  =  byte. nil)) 
--  connections  free,  make  the  request 
SEQ 

make. conn (to. cO,  from.cO,  to.cl, 

from. cl,  source, dest, accomplished) 
IF 

accomplished 
SEQ 

--  conn  made  OK,  send  ACK 
to. pipe  !  link.ack;  source; 
dest 
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otherwise 
SEQ 

--  for  whatever  reason,  no  go 
to. pipe  !  link.req;  source; 

dest 
req.cnt  :=  req.cnt  +  1 

otherwise  --  repeat  request  later 
to. pipe  !  link.req;  source;  dest 

token  =  link.rel 

--  break  an  existing  connection  btw  2  Trams 
SEQ 

break. conn  (to. cO,  from.cO,  to.cl,  from.cl, 

source,  dest,  accomplished) 
IF 

accomplished   --  links  broken 
SEQ 

to. pipe  !  link.brk;  source;  dest 
otherwise     --links  not  broken, 

--  let  link.rel  loop  again 
to. pipe  !  link.rel;  source;  dest 

token  -    link.abt 
SEQ 

to. pipe  !  link.abt;  source;  dest 

--  pass  to  originator 
get . current . tie  (to.cO,  from.cO,  to.cl, 

from.cl,  source,  src.conn) 
get . current .tie  (to.cO,  from.cO,  to.cl, 

from.cl,  dest,  dst.conn) 
continue  :=  TRUE 
i  :=  0 

WHILE  ((i  <  16)  AND  continue) 
SEQ 
IF 

(  ( (edge. a [i] ) =src. conn) 
OR  ( (edge. b [i] ) =src . conn) ) 
SEQ 

edge. Ok [i]  :=  FALSE 
continue  :=  FALSE 
to. pipe  !  BYTE(i+220);  byte. nil; 
byte. nil 
otherwise 
i  :=  i  +  1 

to. pipe  !  link.rel;  source;  dest 

break . conn  (to . cO, from. cO, to . cl, from. cl, 

source,  dest,  accomplished) 
make . conn   (to . cO, from. cO, to . cl, from. cl, 

source,  dest,  accomplished) 
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IF 

accomplished 

to. pipe  !  link.ack;  source;  dest 
otherwise 
SEQ 

to. pipe  !  link.req;  source;  dest 
req.cnt  :=  req.cnt  +  1 

token  =  link.ack 

--consume  rest  of  acknowledge  packet 

SKIP 
token  =  link.brk 

--  msg  cycled,  consume 

SKIP 
token  =  link.rei 

--consume  rest  of  acknowledge  packet 

SKIP 
otherwise 

--  pass  it  on  for  testing  and  monitoring 

to. pipe  !  token;  source;  dest 

(req. cnt>no. trams)  &  clock  ?  AFTER  now  PLUS  delay 
SEQ 

clock  ?  now 

req.cnt  :=  0 

edge. ok [edge. index]  :=  TRUE 

to. pipe  !  BYTE (edge. index+200) ;  byte. nil; 

byte .nil 
edge. index  :=  edge. index  +  1 
IF 

edge. index  >=  (edgeO.ct  +  edge3.ct) 

edge. index  :=  0 
otherwise 
SKIP 
end  of  controller  code 
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—  TRAM. CODE  -  code  for  each  TRAM,  performing  work  & 

requesting  links 

PROC  tram. code  (CHAN  OF  ANY  to. pipe,  from. pipe, 

linkO.in,  linkO . out, link3 . in,  link3.out, 
VAL  INT  slot.desig,  no. trams) 


--  LINK. MANAGER  -  maintains  pipe  local  link  control 


PROC  link. manager 


(CHAN  OF  ANY  to. pipe,  from. pipe, 
mgr . to. agentO,  mgr . f rom. agentO, 
mgr . to. agent3,  mgr . from. agent3, 
mgr . from. userO,  mgr . from. user3, 

CPiAN  OF    INT  mgr .  to .  userO,    mgr  .  to  .user3, 

VAL  INT  slot.desig,  no. trams) 


#USE  "c4cmds.tsr" 
#USE  "ctrlcmds.tsr" 
#USE  "intcmds.tsr" 
BYTE  linkO,  link3  : 


BYTE  linkO.tied.to,  link3 

Ltied.tc 

)  : 

BYTE  token,  bytel,  byte2. 

req. source,  req.dest  : 

--  index  corresponds  to  s 

;lot  number  w/   0-15  for  linkO 

-- 

and  w/  20-35  for  link3 

--  stored  value  is  hardwired  C004 

[  link  to  linkO  or  3 

VAL  link03.desig  IS  [22, 

5,   4, 

19,  18,   3, 

6, 

16,   9, 

28,  31,  15, 

12, 

25,  26, 

8,  99,  99, 

99, 

99,  23, 

2,   0,  17, 

21, 

1,   7, 

20,  14,  29, 

30, 

11,  13, 

27,  24,  10]: 

VAL  INT  buffer. size   IS  3  : 
[buffer. size-ljCHAN  OF  ANY  b: 
[buffer. size-2]BYTE  t,  s,  d  : 
CHAN  OF  ANY  to. buffer  : 
BYTE  t.in,  s.in,  d.in,  t.out. 


s.out,  d.out 


PAR 
SEQ 

linkO  :=  BYTE (link03 . desig [slot . desig] ) 
link3  :=  BYTE (link03 .desig [ (slot .desig+20) ] ) 
linkO.tied.to  :=  byte. nil 
link3 . tied. to  :=  byte. nil 
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WHILE  TRUE 
ALT 

mgr , from. agent 0  ?  token 
IF 

(token  =  done . with. link) 

to. buffer  !  link.rel;  linkO; 
linkO . tied. to 
(token  =  failed) 

to. buffer  !  link.abt;  linkO; 
linkO .tied. to 
(token  =  all. done) 

to. buffer  !  BYTE (slot . desig+100) ; 
byte. nil;  byte. nil 
(  (INT (token) )  <  36) 
SEQ 

--  token  is  a  link  designation 
linkO . tied. to  := 

BYTE  (link03.desig[INT (token) ] ) 
to. buffer  !  link.req;  linkO; 
linkO.tied.to 
otherwise 
SKIP 

mgr . from. agents  ?  token 
IF 

(token  =  done .with. link) 

to. buffer  !  link.rel;  linkS; 
links .tied. to 
(token  =  failed) 

to. buffer  !  link.abt;  linkS; 
links . tied. to 
(token  =  all. done) 

to. buffer  !  BYTE (slot . desig+120) ; 
byte. nil;  byte. nil 
(  (INT (token) )  <  S6) 
SEQ 

--  token  is  a  link  designation 
links . tied. to  := 

BYTE  (linkOS.desig[ INT (token) ] ) 
to, buffer  !  link.req;  linkS; 
links .tied. to 

otherwise 
SKIP 

from, pipe  ?  token;  bytel;  byte2 
IF 

token  =  link.req 

to. buffer  !  token;  bytel;  byte2 
token  =  link,rel 

to. buffer  !  token;  bytel;  byte2 
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token  =  link.ack 
SEQ 
IF 

--  if  local  link  involved, 
inform  user  to  xmit 
byte2  =  linkO 
SEQ 

mgr . to.userO  !  user . linkmade 
byte2  :=  byte. nil 
byte2  =  link3 
SEQ 

mgr.to.userS  !  user . linkmade 
byte2  :=  byte. nil 
otherwise 
SKIP 

IF 

--  if  local  link  involved, 
inform  user  to  xmit 
bytel  =  linkO 
SEQ 

mgr . to. agentO  !  link. made 
bytel  :=  byte. nil 
bytel  =  link3 
SEQ 

mgr .to . agents  !  link. made 
bytel  :=  byte. nil 
otherwise 
SKIP 

IF 

(  (bytel=byte.nil) AND (byte2=byte . nil) ) 

SKIP  --  consume 
otherwise 

to. buffer  !  token;  bytel;  byte2 

token  =  link.brk 
IF 

--  if  local  links  involved, 

clear  user  status,  consume 
bytel  =  linkO 
SEQ 

linkO . tied. to  :=  byte. nil 
mgr . to. agentO  !  link. gone 
bytel  =  links 
SEQ 

links .tied. to  :=  byte. nil 
mgr . to . agents  !  link. gone 
otherwise 

--  pass  on  if  not  of  local  interest 
to. buffer  !  token;  bytel;  byte2 
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token  =  link.abt 

--  since  originated  by  source, 

consume  if  this  is  source 
SEQ 
IF 

(  (bytel  =  linkO)  OR  (bytel  =  link3) ) 

SKIP   — consume 
otherwise 

to. buffer  !  token;  bytel;  byte2 

IF 

byte2  =  linkO 

mgr.to.userO  !  user. failed 
byte2  =  link3 

mgr.to.user3  !  user. failed 
otherwise 

SKIP 

token  =  link.rei 
SEQ 
IF 

--  if  local  link  involved, 
inform  user  to  xmit 
bytel  =  linkO 
SEQ 

mgr . to.agentO  !  reinit 
bytel  :=  byte. nil 
bytel  =  links 
SEQ 

mgr . to . agents  !  reinit 
bytel  :=  byte. nil 
otherwise 
SKIP 

IF 

--  if  local  link  involved, 
inform  user  to  xmit 
byte2  =  linkO 
SEQ 

mgr.to.userO  !  user. reinit 
byte2  :=  byte. nil 
byte2  =  linkS 
SEQ 

mgr.to.userS  !  user. reinit 
byte2  :=  byte. nil 
otherwise 
SKIP 
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IF 

( (bytel=byte.nil) AND (byte2=byte . nil) ) 

SKIP 
otherwise  • 

to. buffer  !  token;  bytel;  byte2 

otherwise 

—  non-cmd  byte,  pass  it  on 
to. buffer  !  token;  bytel;  byte2 

PAR 

WHILE  TRUE 
SEQ 

to. buffer  ?  t.in;  s.in;  d.in 
b[0]  !  t.in;  s.in;  d.in 

PAR 

PAR  p  =  0  FOR  (buffer. size-2) 
WHILE  TRUE 
SEQ 

b[p]     ?    t[p];    s[p];    d[p] 
b[p+l]     !    t[p];    s[p];    d[p] 

WHILE  TRUE 
SEQ 

b [buffer . size-2 ]  ?  t.out;  s.out;  d.out 
to. pipe  !  t.out;  s.out;  d.out 
:  --  end  of  manager 


--  AGENTO  -  handle  data  msg  output  for  userO 

PROC  agentO  (CHAN  OF  ANY  mgr . to . agentO,  mgr. from. agent 0, 

agent 0 .to.userO, 
agentO . f rom.userO,  linkO.out, 
VAL  INT  link.desig) 

#USE  reinit 

#USE  "intcmds.tsr" 

#USE  "blcksize.tsr" 

#USE  "faultime.tsr" 

VAL  block. len  IS  INT (block . size)  : 

BYTE  byte. in  : 

[block. len] BYTE  msg  : 

BOOL  continue,  aborted  : 

TIMER  time.c  : 

INT  now,  check. time  : 
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SEQ 

agentO . to . userO  !  ready .to . rev 
continue  :=  TRUE 
WHILE  continue 
PRI  ALT 

mgr . to . agentO  ?  byte. in 
IF 

byte. in  =  link. made 
SEQ 

msg[0]  :=  BYTE (link. desig) 

--  repl  dest  w/  source 
time.c  ?  now 

check. time  :=  now  PLUS  fault. time 
Output OrFail . t (linkO . out^msg, time . c, 

check. time, aborted) 
IF 

aborted 
SEQ 

mgr . from. agentO  !  failed 
mgr .to . agentO  ?  byte. in 
Reinitialise (link 0. out) 
otherwise 

mgr . from. agentO  !  done . with. link 

byte. in  =  link. gone 

agentO .to .userO  !  ready. to. rev 

otherwise 
SKIP 

agentO . from. userO  ?  msg 
SEQ 

mgr . from. agentO  !  msg[0] 
IF 

msg[0]  =  all. done 

continue  :=  FALSE 
otherwise 
SKIP 
--  end  of  agentO 
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--  USERO  -  production  process  using  linkO 

PROC  userO  (CHAN  OF  ANY  agentO . to . userO, 

agentO . f rom.userO,  mgr . from. user 0,  linkO.in, 
CHAN  OF  INT  mgr . to . userO, 
VAL  INT  link.desig,  no. trams) 

#USE  reinit 
#USE  snglmath 
#USE  "intcmds.tsr" 
#USE  "blcksize.tsr" 

--  user  supplied  values 
VAL  block. len   IS  INT  (block. size)  : 
[block. len]BYTE  data. out,  data. in  : 
BYTE  in. byte  : 

INT  req.dest,  counter,  check . time, now,  any.int,  seedlG: 
BOOL  continue,  aborted,  active  : 
INT32  seed  : 
REAL32  result  : 
TIMER  time.c,  clock  : 

PRI  PAR 

WHILE  TRUE 
SEQ 

mgr.to.userO  ?  any.int  --  trigger  communications 
IF 

any.int  =  user . linkmade 
SEQ 

InputOrFail . c (linkO . in,  data. in, 

mgr.to.userO,  aborted) 
IF 

aborted 
SEQ 

mgr.to.userO  ?  any.int 
IF 

any.int  =  user. reinit 

Reinitialise (linkO . in) 
otherwise 
SKIP 

otherwise 

SKIP   —  data  rcvd  OK,  use  as  desired 

any.int  =  user. reinit 
Reinitialise (linkO . in) 

otherwise 
SKIP 
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SEQ 
IF 

link.desig  <  no. trams 
SEQ 

continue  :=  TRUE 
counter  :=  0 
otherwise 
SEQ 

continue  :=  FALSE 
counter  :=  10000 

clock.  ?  seedlG 

seed  :=  INT32 (seedl6) 

--seed  :=  INT32 (link. desig*100) 

WHILE  counter  <  1000 
SEQ 

counter  :=  counter  +  1 
--  generate  data 
SEQ  i  =  1  FOR  (block. len-1) 
data. out [i]  :=  100  (BYTE) 

--  send  to  any  link  0  or  3 

req.dest  :=  link.desig 

WHILE  ((req.dest  =  link.desig)  OR 

((req.dest  >=  no. trams)  AND 

(req.dest  <  20) )  OR 

(req.dest  >=  (no . trams+20) ) ) 
--  for  link  0-0  ONLY!   (commented  out  otherwise) 
--WHILE  ( (req.dest=link.desig)  OR 
-- (req.dest>=no. trams) ) 

SEQ 

result, seed  :=  RAN(seed) 

req.dest  :=  INT  ROUND  (result *31 . 0 (REAL32) ) 

--  put  destination  address  in  array  at  addr  0 
data.out[0]  :=  BYTE (req. dest ) 
agentO . to.userO  ?  in. byte 
agentO . f rom.userO  !  data. out 

agentO .to.userO  ?  in. byte 
agentO . from. userO  !  all. done 
•  end  of  userO 
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--  AGENT3  -  handle  data  msg  output  for  user3 

PROC  agent3  (CHAN  OF  ANY  mgr . to . agentS,  mgr . from. agent 3, 

agent 3 .to .user 3, 
agent3 . f rom. user3,  link3.out, 
VAL  INT  link.desig) 

#USE  reinit 

#USE  "intcmds.tsr" 

#USE  "blcksize.tsr" 

#USE  "faultime.tsr" 

VAL  block. len  IS  INT (block . size)  : 

BYTE  byte. in  : 

[block. lenJBYTE  msg  : 

BOOL  continue,  aborted  : 

TIMER  time.c  : 

INT  now,  check. time  : 

SEQ 

agent3 . to . user3  !  ready. to. rev 
continue  :=  TRUE 
WHILE  continue 
PRI  ALT 

mgr . to . agent3  ?  byte. in 
IF 

byte. in  =  link. made 
SEQ 

msg[0]  :=  BYTE (link. desig) 

--  repl  dest  w/  source 
time.c  ?  now 

check. time  :=  now  PLUS  fault. time 
Output Or Fail . t (link3 . out, msg, time . c, 

check. time, aborted) 
IF 

aborted 
SEQ 

mgr . from. agent3  !  failed 
mgr . to. agent3  ?  byte. in 
Reinitialise  (link 3 . out) 
otherwise 

mgr . from. agent3  !  done .with. link 

byte. in  =  link. gone 

agent3 . to .user3  !  ready. to. rev 

otherwise 
SKIP 
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agent3 . f rom. user3  ?  msg 
SEQ 

mgr. from. agents  !  msg[0] 
IF 

msg[0]  =  all. done 

continue  :=  FALSE 
otherwise 
SKIP 
:  --  end  of  agentS 


--  USER3  -  production  process  using  linkS 

PROC  users  (CHAN  OF  ANY  agentS . to . userS, 

agents . from. user3,  mgr . f rom.userS,  linkS.in, 
CHAN  OF  INT  mgr . to . userS, 
VAL  INT  link.desig,  no. trams) 

#USE  reinit 
#USE  snglmath 
#USE  "intcmds.tsr" 
#USE  "blcksize.tsr" 

--  user  supplied  values 
VAL  block. len   IS  INT (block. size)  : 
[block. len] BYTE  data. out,  data. in  : 
BYTE  in. byte  : 

INT  req.dest,  counter,  check. time, now,  any.int,  seedl6: 
BOOL  continue,  aborted,  active  : 
INT32  seed  : 
REAL32  result  : 
TIMER  time.c,  clock  : 

PRI  PAR 

WHILE  TRUE 
SEQ 

mgr.to.user3  ?  any.int  --  trigger  communications 
IF 

any.int  =  user . linkmade 
SEQ 

InputOrFail . c (links . in,  data. in, 

mgr . to.userS,  aborted) 
IF 

aborted 
SEQ 

mgr. to.userS  ?  any.int 
IF 

any.int  =  user. reinit 
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Reinitialise (link 3. in) 
otherwise 
SKIP 

otherwise 

SKIP   --  data  rcvd  OK,  use  as  desired 

any.int  =  user.reinit 
Reinitialise (link 3 . in) 

otherwise 
SKIP 

SEQ 
IF 

link.desig  <  (no. trams  +  20) 
SEQ 

continue  :=  TRUE 
counter  :=  0 
otherwise 
SEQ 

continue  :=  FALSE 
counter  :=  10000 

clock  ?  seedlG 

seed  :=  INT32 (seedl6) 

— seed  :=  INT32 (link. desig*100) 

WHILE  counter  <  1000 
SEQ 

counter  :=  counter  +  1 
--  generate  data 
SEQ  i  =  1  FOR  (block. len-1) 
data.out[i]  :=  lOO(BYTE) 

--  send  to  any  link  0  or  3 

req.dest  :=  link.desig 

WHILE  ((req.dest  =  link.desig)  OR 
((req.dest  >=  no. trams)  AND 
(req.dest  <  20) )  OR 
(req.dest  >=  (no. trams+20) ) ) 

--  for  link  3-3  ONLY!  (comment  out) 
— WHILE  ((req.dest  =  link.desig)  OR 
—  (req.dest  <  20)  OR 
--(req.dest  >=  (no .trams+20) ) ) 

SEQ 

result, seed  :=  RAN(seed) 

req.dest  :=  INT  ROUND  (result *31 . 0 (REAL32) ) 
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--  put  destination  address  in  array  at  addr  0 
data.out[0]  :=  BYTE (req.dest) 

agents . to . user3  ?  in. byte 
agents . from, user3  !  data. out 

agents .to. userS  ?  in. byte 
agents . from, userS  !  all. done 
end  of  userS 


--  TRAM. CODE  -  master  program  for  each  Tram 

—  internal  channels  between  users  and  manager 
CHAN  OF  ANY  mgr . to . agentO,  mgr . f rom. agentO, 
mgr .to. agents,  mgr . f rom. agentS, 
mgr , f rom. userO,  mgr , f rom.userS, 
agent 0 , to . userO,  agent S , to .userS, 
agent 0 ,from.userO,  agent S .from.userS 
CHAN  OF  INT  mgr , to , userO ,  mgr.to.userS  : 

PRI  PAR 
PAR 

link, manager  (to. pipe,  from. pipe, 

mgr , to . agentO,  mgr . from. agentO, 
mgr , to . agents,  mgr . from. agentS, 
mgr . from. userO,  mgr . from.userS, 
mgr . to. userO,  mgr.to.userS, 
slot.desig,  no. trams) 

agentO  (mgr . to . agentO,  mgr . from. agentO, 

agentO , to .userO,  agentO . f rom.userO, 
linkO.out,  slot.desig) 

agents  (mgr . to. agentS,  mgr . from. agentS, 

agents . to. user S,  agentS . from. userS, 
links. out,  slot .desig+20) 

PAR 

userO  (agentO . to. userO,  agentO . f rom.userO, 
mgr . from. userO,  linkO.in, 
mgr . to .userO,  slot.desig,  no. trams) 

userS  (agents , to. userS,  agentS . from. userS, 
mgr . from.userS,  linkS.in, 
mgr.to.userS,  slot .desig+20,  no. trams) 

—  end  of  TRAM. CODE 
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—  Placed  Pars  (Controller  ->  T212,  TramCode  ->  T800's) 
--  Global  Declarations,  Constants  and  Placed  Pars 

#USE  "links. tsr" 

--  control  info  passing  pipe  btw  T212  and  Trams 
[no.trams+2]CHAN  OF  ANY  pipe  : 

—  links  btw  T212  and  C004s 
CHAN  OF  ANY  t2.to.cO,  t2.from.cO, 

t2.to.cl,  t2.from.cl  : 

--  !!!!  make  protocol  for  linkO's  &  3's  !!!! 
--  data  xfer  channels  between  all  Trams  (via  C004's) 
[no. trams] CHAN  OF  ANY  linkO.to.cO,  linkO . f rom. cl, 

link3.to.cl,  link3 . f rom. cO  : 


PLACED  PAR 

PROCESSOR  no. trams  T2 
PLACE  pipe[0] 
PLACE  pipe [no.trams+1] 
PLACE  t2.to.c0 
PLACE  t2.from.cO 
PLACE  t2.t0.cl 
PLACE  t2.from.cl 


AT  link2in 
AT  linklout 
AT  linkOout 
AT  linkOin 
AT  linkSout 
AT  linkSin 


controller 


(pipe[0],  pipe [no. trams+1] , 
t2.to.cO,  t2.from.c0, 
t2.to.cl,  t2.from.cl,  no. trams) 


PLACED  PAR  slot  =  0  FOR  (no. trams) 


PROCESSOR  (  (slot  +  1) *100)  T8 
PLACE  pipe [slot] 
PLACE  pipe [slot+1] 
PLACE  linkO. from. cl [slot] 
PLACE  Iink0.to.c0[slot] 
PLACE  links. from. cO[slot] 
PLACE  links. to. cl [slot] 


AT  linklout 
AT  link2in 
AT  linkOin 
AT  linkOout 
AT  linkSin 
AT  linkSout 


tram. code  (pipe[slot],  pipe [slot+1] , 

linkO . f rom.cl [slot],  Iink0.to.c0[slot], 
links . from. cO[slot],  link3.to.cl[slot], 
slot,  no. trams) 
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APPENDIX    B 
MESSAGE    EXCHANGE    TEST    CODE    FOR    B004 


—  TIMER  -  code  for  B004  Timer  of  B012  activity  in  MSGX 
--  Operating  on  B004  Evaluation  Board 

#USE  userio 

#USE  "links. tsr"  —  transputer  link  defns 

#USE  "blcksize . tsr"  --  data  block  size  for  ref 


CHAN  OF  ANY  up. in,  up. out,  echo . to .buff er  : 

--  place  links 

PLACE  up. in       AT  link2in   : 

PLACE  up. out      AT  linkSout  : 

PROC  timer  (CHAN  OF  INT  keyboard, 

CHAN  OF  ANY  screen,  up. in,  up. out, 
VAL  INT  no. trams) 

#USE  "intcmds.tsr" 

VAL  terminate   IS  199  (BYTE) :  —  shuts  down  buffer  procs 

BYTE  command. byte,  bytel,  byte2  : 

BOOL  more  : 

TIMER  clock  : 

[37] [7] INT  counter  :  --  various  command  counters 

[6] INT  total,  grand  : 

[36] INT  user .done  : 

INT  key. in,  done. count,  sum,  sumtot,  grandtot  : 

INT  start,  stop,  elapsed,  slot. temp  : 


—  main  code  for  B004  monitor  -  TIMER 

SEQ 

--  initialize  loop 
more  :=  TRUE 
done. count  :=  0 
sumtot  :=  0 
grandtot  :=  0 
SEQ  i  =  0  FOR  36 

user.done[i]  :=  0 
SEQ  i  =  0  FOR  37 
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SEQ  j  =  0  FOR  7 

counter [i] [ j]  :=  0 
SEQ  i  =  0  FOR  6 
SEQ 

total [i]  :=  0 
grand[i]  :=  0 
clock  ?  start 
write . int (screen, start,  12) 
newline (screen) 

WHILE  more 
PRI  ALT 

up. in  ?  command. byte;  bytel;  byte2 
IF 

(( (INT (command. byte) )  >  (39))  AND 
( (INT (command. byte) )  <  (46))) 
SEQ 

--  command  byte,  get  next  two  parameters 
up. out  !  command. byte;  bytel;  byte2 
--  update  appropriate  command  counter 
counter [to. slot [INT (bytel) ] ] 
[ (INT (command. byte) ) -40]  := 
counter [to. slot [INT (bytel) ] ] 
[ (INT (command. byte) ) -40]  +  1 

(( (INT (command. byte) )  >  (99))  AND 
( (INT (command. byte) )  <  (132))) 
--  a  tram  has  sent  a  notice  it  is  done  xmit 
SEQ 

clock  ?  slot. temp 

user .done [ ( (INT (command. byte) ) -100) ]  := 

(  (slot. temp-start) *64) /lOOO 
done. count  :=  done. count  +  1 
IF 

done. count  =  (no. trams  *  2) 
SEQ 

write . full . string (screen, 

"Test  Complete!") 
newline (screen) 
clock  ?  stop 

elapsed  :=  (  (stop-start ) *64) /lOOO 
write. full. string (screen, 

"Elapsed  Time:  ") 
write . int (screen,  elapsed,  12) 
write. full . string (screen, 

"    Block  Size:  ") 
write . int (screen,  INT (block. size), 12) 

otherwise 
SKIP 
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otherwise 
SEQ 
SKIP 

)ceyboard  ?  key.  in 
SEQ 

--  handle  keyboard  inputs 
IF 

(key. in  =  81)  OR  (key. in  =  113)  —  "Q":  quit 
SEQ 

more  :=  FALSE 

up. out  !  terminate;  terminate;  terminate 

(key. in  =  80)  OR  (key. in  =  112)  —  "P":  pause 
SEQ 

newline (screen) 

write . full . string (screen, "press  any  key.") 
keyboard  ?  key. in 

(key. in  =  82)  OR  (key. in  =  114) 
--  "R" :  resend  a  byte 
up. out  !  BYTE(60) 

(key. in  =  83)  OR  (key. in  =  115) 
--  "S":  show  summary 
SEQ 

newline (screen) 

newline (screen) 

sumtot  :=  0 

grandtot  :=  0 

write . full . string (screen, "  ") 

write . full . string (screen, 

"link.req     link.ack     link.brk") 
write . full . string (screen, 

link.abt     link.rei  ") 
newline (screen) 

SEQ  i  =  0  FOR  6 
SEQ 

grand [i]  :=  0 
total[i]  :=  0 

SEQ  i  =  0  FOR  no. trams 
SEQ 

write. int (screen, i, 6) 
write . int (screen, counter [i] [0] 
write . int (screen, counter [ i] [1] 
write . int (screen, counter [i] [3] 
write . int (screen, counter [i] [4] 
write . int (screen, counter [i] [5] 
write . int (screen, user. done [i ] , 12) 
newline (screen) 
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,12) 

,12) 

,12) 

,12) 

,12) 

SEQ  j  =  0  FOR  6 
SEQ 

total[j] : =total [ j] +counter [i] [j] 
grand[j]  :  =granci  [  j  ] +counter  [i]  [j] 
sumtot  :=  sumtot  +  counter [i] [ j] 
grandtot :=grandtot+ counter [i] [ j ] 

write. full . string (screen, "Totals") 
write. int (screen, total [0]  ,  12) 
write . int (screen, total [1]  ,  12) 
write . int (screen, total [3]  ,  12) 
write. int (screen, total [4]  ,  12) 
write. int (screen, total [5]  ,  12) 
write . int (screen, sumtot,  12) 
newline (screen) 
newline (screen) 

sumtot  :=  0 
SEQ  i  =  0  FOR  6 
total[i]  :=  0 
SEQ  i  =  20  FOR  no. trams 
SEQ 

write . int (screen, i, 6) 
write. int (screen, counter [i] [0] 
write. int (screen, counter [i] [1] 
write. int (screen, counter [i] [3] 
write . int (screen, counter [i] [4] 
write . int (screen, counter [i] [5] 
write . int (screen, user .done [i]  ,  12) 
newline (screen) 
SEQ  j  =  0  FOR  6 
SEQ 

total [j] :=total [ j] ^counter [i] [ j] 
grand[j] : =grand [ j ] +counter [ i] [ j] 
sumtot  :=  sumtot  +  counter [i] [ j ] 
grandtot : =grandtot+counter [i]  [ j  ] 

write. full . string (screen, "Totals") 
write . int (screen, total [0] , 12) 
write . int (screen, total [1 ] , 12) 
write. int (screen, total  [3] , 12) 
write . int (screen, total [4] , 12) 
write . int (screen, total [5] , 12) 
write . int (screen, sumtot, 12) 
sumtot  :=  0 
newline (screen) 
newline  (screen) 

write. full. string (screen, "Grands") 
write . int (screen, grand [0] , 12) 
write . int (screen, grand [ 1 ] , 12) 
write . int (screen, grand [3] , 12) 
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,12) 

,12) 

,12) 

,12) 

,12) 

write. int (screen, grand [4]  ,  12) 
write . int (screen, grand [5] , 12) 
write . int (screen, grandtot ,  12) 


otherwise  —  any  key,  ignored 

SKIP 


end  of  TIMER 


--  BUFFER  --  variable  buffer  on  pipe 

PROC  buffer  (CHAN  OF  ANY  into. buffer,  outof .buf f er! 


VAL  INT  buffer. size   IS  3  : 

BYTE  t.in,  s.in,  d.in,  t.out,  s.out,  d.out  : 

[buffer. size-1] CHAN  OF  ANY  b: 

[buffer. size-2]BYTE  t,  s,  d  : 

[buffer . size-2] BOOL  buffer. on  : 

BOOL  first .buffer . on,  last .buffer . on  : 

VAL  terminate  IS  199  (BYTE) : 

VAL  BOOL  otherwise   IS   TRUE: 

PAR 
SEQ 

first .buffer .on  :=  TRUE 
WHILE  first .buffer .on 
SEQ 

into. buffer  ?  t.in;  s.in;  d.in 

b[0]  !  t.in;  s.in;  d.in 

IF 

t . in  =  terminate 

first .buffer .on  :=  FALSE 
otherwise 
SKIP 
PAR  --  replicated  PAR  to  desired  buffer  size 

PAR  p  =  0  FOR  (buffer. size-2) 
SEQ 

buffer. on [p]  :=  TRUE 
WHILE  buffer.on[p] 
SEQ 

b[p]    ?   t[p];    s[p];    d[p] 
b[p+l]     !    t[p];    s[p];    d[p] 
IF 

t [p]  =  terminate 

buffer. on [p]  :=  FALSE 
otherwise 
SKIP 
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SEQ 

last .buffer .on  :=  TRUE 
WHILE  last. buffer. on 
SEQ 

b [buffer . size-2]  ?  t.out;  s.out;  d.out 
outof. buffer  !  t.out;  s.out;  d.out 
IF 

t.out  =  terminate 

last .buffer .on  :=  FALSE 
otherwise 
SKIP 
--  end  of  buffer 


--  BUFFER  --  variable  buffer  on  pipe 

PROC  buffer  (CHAN  OF  ANY  into. buffer,  outof .buffer) 


VAL  INT  buffer. size   IS  3  : 
BYTE  byte. in,  byte. out  : 
[buffer. size-1] CHAN  OF  ANY  b: 
[buffer. size-2]BYTE  byte. hold  : 
[buffer. size-2]B00L  buffer. on  : 
BOOL  first .buffer . on,  last .buffer . on 
VAL  terminate  IS  199  (BYTE) : 
VAL  BOOL  otherwise   IS   TRUE: 


PAR 

SEQ 

first. 

.buffer. on  :=  TRUE 

WHILE 

first .buffer .on 

SEQ 

into. buffer  ?  byte. in 

b[0]  !  byte. in 

IF 

byte. in  =  terminate 

first .buffer . on  := 

FALSE 

otherwise 

SKIP 

PAR 

PAR  p 

=  0  FOR  (buffer. size- 

■2) 

SEQ 

buffer. on [p]  :=  TRUE 

WHILE  buffer. on [p] 

SEQ 

b[p]  ?  byte.hold[p] 

b[p+l]  !  byte.hold[p] 

IF 

byte.hold[p]  =  terminate 
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buffer. on [p]  :=  FALSE 
otherwise 
SKIP 

SEQ 

last .buffer .on  :=  TRUE 
WHILE  last. buffer. on 
SEQ 

b [buffer . size-2]  ?  byte. out 
outof. buffer  !  byte. out 
IF 

byte. out  =  terminate 

last .buffer. on  :=  FALSE 
otherwise 
SKIP 
:  --  end  of  buffer 

PAR 

timer (keyboard,    screen,    up. in,    echo .to .buffer,    no. trams) 
buffer (echo . to .buffer,    up. out) 
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APPENDIX    C 
MESSAGE    EXCHANGE    ACTIVITY    DISPLAY 


—  STATUSdb  -  B004  Status  of  B012  activity  in  MSGX 
--  To  display  activity  of  message  exchange 

#USE  userio 
#USE  "links. tsr" 

CHAN  OF  ANY  up. in,  up. out,  echo . to .buffer, 
buffer. to .echo,  shut. down  : 

--  place  links  to  intercept  link  control  pipe  commands 
PLACE  up. in       AT  link2in   : 
PLACE  up. out       AT  linkSout  : 


--  BUFFER. IN  --  variable  buffer  on  pipe  before  status 
--  allows  B012  to  process  pipe  w/o  backup 

PROC  buffer. in  (CHAN  OF  ANY  into. buffer,  outof . buf f er, 

shutdown) 

CHAN  OF  ANY  to. buffer  : 

VAL  INT  buffer. size   IS  25  :    --  number  of  buffer  stages 

BYTE  byte. in,  byte. out,  byte.sd  : 

[buf fer. size-1] CHAN  OF  ANY  b:   --  array  of  channels 

[buf fer. size-2] BYTE  byte. hold  :--  array  of  command  bytes 

[buf fer . size-2] BOOL  buffer. on  :--  array  for  shutdown 

BOOL  first .buf fer . on,  last .buf fer . on,  shutdown . active  : 
VAL  terminate  IS  199  (BYTE) :  —  constant  for  shutdown 
VAL  BOOL  otherwise   IS   TRUE: 

PAR 

—  get  next  input  from  B012  or  from  B004  for  shutdown 
PAR 
SEQ 

shutdown. active  :=  TRUE 
WHILE  shutdown. active 
ALT 

shutdown  ?  byte.sd 
SEQ 

to. buffer  !  byte.sd 


132 


shutdown . active  :=  FALSE 
into. buffer  ?  byte.sd 
to. buffer  !  byte.sd 

--  place  either  in  first  buffer  space, 

--terminate  process  if  shutdown 
SEQ 

first .buffer. on  :=  TRUE 
WHILE  first .buffer. on 
SEQ 

to. buffer  ?  byte. in 

b[0]  !  byte. in 

IF 

byte. in  =  terminate 

first .buffer .on  :=  FALSE 
otherwise 
SKIP 
—  replicated  PAR  for  multi  stage  buffer. 

— Each  terminated  upon  shutdown 
PAR 

PAR  p  =  0  FOR  (buffer. size-2) 
SEQ 

buffer. on [p]  :=  TRUE 
WHILE  buffer. on [p] 
SEQ 

b[p]    ?   byte.hold[p] 
b[p+l]     !    byte.hold[p] 
IF 

byte.hold[p]  =  terminate 

buffer. on [p]  :=  FALSE 
otherwise 
SKIP 

--  final  buffer  stage  for  delivery  to  STATUS  process 
SEQ 

last .buffer .on  :=  TRUE 
WHILE  last .buffer. on 
SEQ 

b [buffer . size-2]  ?  byte. out 
outof. buffer  !  byte. out 
IF 

byte. out  =  terminate 

last .buffer. on  :=  FALSE 
otherwise 
SKIP 
:  --  end  of  buffer. in 

PROC  statusdb  (CHAN  OF  INT  keyboard, 

CHAN  OF  ANY  screen,  up. in,  up. out,  shutdown, 
VAL  INT  no. trams) 
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#USE  "intcmds.tsr" 

#USE  "ctrlcmds.tsr" 

VAL  terminate   IS  199  (BYTE) : 

BYTE  command. byte,  bytel,  byte2,  link. byte  : 

BOOL  more,  send. quit  : 

TIMER  Clock  : 

[17] INT  num. used  :  --  array  for  #  of  links 

[37] [6] INT  counter  :  --  array  for  #  of  commands 

[6] INT  total,  grand  :         —  totals  and  subtotals 

[36] INT  user. done  :  —  track  TRAM  USER  done 

INT  key. in,  done. count,  sum,  sumtot,  grandtot  : 

INT  num.acks,  num.reqs,  num.brks,  num.abts,  num.reis  : 

INT  line.num,  xpos.inc,  asci.inc,  offset  : 

INT    linkl,  link2,  links. used  : 

INT  start,  stop,  elapsed,  delay,  now  : 


--  translate  input  byte  from  BYTE 
--  to  english  quivelant  for  display 

PROC  xlate.byte  (CHAN  OF  ANY  screen,  VAL  BYTE  x.byte, 

VAL  INT  xpos) 

#USE  "intcmds.tsr" 
#USE  "ctrlcmds.tsr" 
SEQ 

goto . xy (screen, xpos, 0) 
IF 

--  bytes  0-31  passed  correspond  to  C004  link  nos. 
x.byte  =  (BYTE   0) 

write . full . string  (screen,  "-Slot   2  Link  3   ") 
x.byte  =  (BYTE   1) 

write . full . string  (screen,  "-Slot   5  Link  3   ") 
x.byte  =  (BYTE   2) 

write . full . string  (screen,  "-Slot   1  Link  3   ") 
x.byte  =  (BYTE   3) 

write. full . string  (screen,  "-Slot   5  Link  0   ") 
x.byte  =  (BYTE   4) 

write. full . string  (screen,  "-Slot   2  Link  0   ") 
x.byte  =  (BYTE   5) 

write. full . string  (screen,  "-Slot   1  Link  0   ") 
x.byte  =  (BYTE   6) 

write. full . string  (screen,  "-Slot   6  Link  0   ") 
x.byte  =  (BYTE   7) 

write. full. string  (screen,  "-Slot   6  Link  3   ") 
x.byte  =  (BYTE   8) 

write. full . string  (screen,  "-Slot  15  Link  0   ") 
x.byte  =  (BYTE   9) 

write. fulliString  (screen,  "-Slot   8  Link  0   ") 
x.byte  =  (BYTE  10) 
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write. full. string 
x.byte  =  (BYTE  11) 

write . full . string 
x.byte  =  (BYTE  12) 

write . full . string 
x.byte  =  (BYTE  13) 

write . full . string 
x.byte  =  (BYTE  14) 

write . full . string 
x.byte  =  (BYTE  15) 

write . full . string 
x.byte  =  (BYTE  16) 

write. full . string 
x.byte  =  (BYTE  17) 

write . full . string 
x.byte  =  (BYTE  18) 

write . full . string 
x.byte  =  (BYTE  19) 

write. full . string 
x.byte  =  (BYTE  20) 

write . full . string 
x.byte  =  (BYTE  21) 

write . full . string 
x.byte  =  (BYTE  22) 

write . full . string 
x.byte  =  (BYTE  23) 

write . full . string 
x.byte  =  (BYTE  24) 

write. full. string 
x.byte  =  (BYTE  25) 

write . full . string 
x.byte  =  (BYTE  26) 

write . full . string 
x.byte  =  (BYTE  27) 

write . full . string 
x.byte  =  (BYTE  28) 

write . full . string 
x.byte  =  (BYTE  29) 

write . full . string 
x.byte  =  (BYTE  30) 

write . full . string 
x.byte  =  (BYTE  31) 

write . full . string  (screen,  "-Slot  10  Link  0 
--  translation  for  link  control  commands  and  NX 
x.byte  =  byte. nil 

write . full . string  (screen,  "-Nil  Byte 
x.byte  =  link.ack 

write . full . string 
x.byte  =  link.rel 

write . full . string 
x.byte  =  link.req 


screen,  "-Slot  15  Link  3 

screen,  "-Slot  11  Link  3 

screen,  "-Slot  12  Link  0 

screen,  "-Slot  12  Link  3 

screen,  "-Slot   8  Link  3 

screen,  "-Slot  11  Link  0 

screen,  "-Slot   7  Link  0 

screen,  "-Slot   3  Link  3 

screen,  "-Slot   4  Link  0 

screen,  "-Slot   3  Link  0 

screen,  "-Slot   7  Link  3 

screen,  "-Slot   4  Link  3 

screen,  "-Slot   0  Link  0 

screen,  "-Slot   0  Link  3 

screen,  "-Slot  14  Link  3 

screen,  "-Slot  13  Link  0 

screen,  "-Slot  14  Link  0 

screen,  "-Slot  13  Link  3 

screen,  "-Slot   9  Link  0 

screen,  "-Slot   9  Link  3 

screen,  "-Slot  10  Link  3 


screen,  "-link.ack 
screen,  "-link.rel 
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write . full . string  (screen,  "-link.req        ") 
x.byte  =  link.brk 

write . full . string  (screen,  "-link.brk        ") 
x.byte  =  link.abt 

write . full . string  (screen,  "-link.abt        ") 
x.byte  =  link.rei 

write . full . string  (screen,  "-link.rei        ") 
--  miscellaneous  signals  used  in  control  &  testing 
x.byte  =  failed 

write . full . string  (screen,  "-failed  ") 

x.byte  =  reinit 

write . full . string  (screen,  "-reinit  ") 

x.byte  =  done. with. link 

write. full . string  (screen,  "-done. with. link   ") 
x.byte  =  all. done 

write . full . string  (screen,  "-all. done        ") 
x.byte  =  ready. to. rev 

write . full . string  (screen,  "-ready . to . rev     ") 
x.byte  =  link. made 

write . full . string  (screen,  "-link. made       ") 
x.byte  =  link. gone 

write . full . string  (screen,  "-link. gone       ") 
x.byte  =  (BYTE  199) 

write . full . string  (screen,  "-terminated  echo  ") 

--  display  TRAM  DONE  notices  when  received 
(( (x.byte) >(BYTE (99) ) )  AND  ( (x .byte) <  (BYTE  (136) )) ) 
SEQ 

goto.xy (screen, 0,17) 
write . full . string  (screen,  "-Tram") 
write. int (screen,  (INT (x.byte) ) -100, 3) 
write . full . string  (screen,  "  done    ") 
newline (screen) 

--  display  BAD  EDGE  notices  when  received 
((  (x.byte) >(BYTE(199) ) )  AND  (  (x .byte) < (BYTE (216) )) ; 
SEQ 

goto.xy (screen, 20,20) 
write . full . string  (screen,  "-Edge") 
write. int (screen, (INT (x.byte) )-200, 3) 
write. full . string  (screen,  "  bad    ") 
newline (screen) 

otherwise 

--  if  unknown  byte  not  successfully  translated 
SEQ 

goto.xy (screen, 20,18) 

write. full. string (screen, "==UNKNOWN  BYTE==    "] 
write. int (screen, INT (x.byte) , 6) 
end  of  xlate  byte 
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--  PROC  handle . screen   --  for  repeated  screen 
updates  for  various  commands  recieved 

PROC  handle. screen  (VAL  BYTE  link.byte, 

VAL  INT  linkl,  link2,  Inl,  ln2) 


SEQ 


--  place  the  first  byte  on  screen  (source) 

TIT 
X  i. 

linkl  <  16  —  a  link  0 

SEQ 

line.num  :=  Inl 
xpos.inc  :=  0 
otherwise  --  a  link  3 

SEQ 

line.num  :=  ln2 
xpos.inc  :=  20 

goto . xy (screen, ( (linkl-xpos .inc)*3)+14,line. num) 
write. char (screen, link .byte) 

--  place  the  second  byte  on  screen  (destination) 
IF 


—  a  link  0 


--  a  link  3 


link2  <  16 
SEQ 

line.num  :=  Inl 
xpos.inc  :=  0 
otherwise 
SEQ 

line.num  :=  ln2 
xpos.inc  :=  20 


goto . xy  (screen,  (  (link2-xpos . inc) *3) +14, line .num) 
write . char (screen, link. byte) 
—  end  of  handle . screen 


--  main  code  for  B004  monitor 


SEQ 

--  initialize  loop 

—  initialize  all  counters  used 

num.acks  :=  0 

num.reqs  :=  0 

num.brks  :=  0 

num.abts  :=  0 

num.reis  :=  0 

more  :=  TRUE 

done. count  :=  0 

links. used  :=  0 
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SEQ  i  =  0  FOR  3  6 

user . done [i]  :=  0 
delay  :=  32768 
SEQ  i  =  0  FOR  17 

num.used[i]  :=  0 

--  initialize  display  format 
SEQ  i  =  0  FOR  2  0 

newline (screen) 
goto. xy (screen, 0,1) 

write. full. string (screen, "Slot  Desig  :   A  B   C   D   E") 
write. full . string (screen, 

"FGHIJKLMNOP") 
newline (screen) 

write. full. string(screen, "Slot  #      :") 
SEQ  i  =  0  FOR  16 

write. int (screen, i, 3) 

--  intitiallize  display  columns 

newline (screen) 

newline (screen) 

write. full. string (screen, "Link  0  REQ  :") 

newline (screen) 

write. full . string (screen, "Link  3  REQ  :") 

newline (screen) 

newline (screen) 

write. full . string (screen, "Link  0  ACK  :") 

newline (screen) 

write . full . string (screen, "Link  3  ACK  :") 

newline (screen) 

newline (screen) 

write. full. string (screen, "Link  0  BRK  :") 

newline (screen) 

write . full . string (screen, "Link  3  BRK  :") 

newline (screen) 

newline (screen) 

write. full. string (screen,  "Link  0  ABT  :") 

newline (screen) 

write . full . string (screen,  "Link  3  ABT  :") 

newline (screen) 

newline (screen) 

write . full . string (screen, "Link  0  REI  :") 

newline (screen) 

write . full . string (screen, "Link  3  REI  :") 

--  initialize  edge  status  display 
goto. xy (screen, 0,21) 

write . full . string (screen, "Edge  #      :") 
SEQ  i  =  0  FOR  16 

write. int (screen,  i,  3) 
newline (screen) 
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goto. xy  (screen, 13,22) 
SEQ  i  =  0  FOR  16 

write . full . string (screen,  "  +  ") 

--  initialize  #  of  edges  used  display 
goto . xy (screen, 64,0) 

write . full . string (screen, "#Used  Count") 
SEQ  i  =  1  FOR  16 
SEQ 

goto . xy (screen, 65, i) 

write . int (screen,  i,  3) 

WHILE  more 
SEQ 

up. in  ?  command. byte   --  get  next  byte  in  pipe 
IF 

--  take  appropriate  display  and  counter  action 
--  for  each  command  received  from  B012 
(( (INT (command. byte) )  >  (39))  AND 
( (INT (command. byte) )  <  (46))) 
SEQ 

--  command  byte,  get  next  two  parameters 
up. in  ?  bytel;  byte2 
up. out  !  command. byte;  bytel;  byte2 
counter [to. slot [INT (bytel) ] ] 
[ (INT (command. byte) ) -40]  := 
counter [to. slot [INT (bytel) ] ] 
[ (INT (command. byte)  ) -40]  +  1 

xlate .byte (screen, command. byte,  0) 
xlate .byte (screen, bytel, 15) 
xlate .byte (screen, by te2,  35) 

--  determine  characters  used  on  screen  for 

display  of  source  &  dest 
IF 

( (INT  (bytel) )  <  32) 

linkl  :=  to. slot [INT (bytel) ] 
otherwise 

linkl  :=  10 

IF 

(  (INT(byte2) )  <  32) 

link2  :=  to . slot [ INT (byte2) ] 
otherwise 
link2  :=  30 

IF 

linkl  <  16 


139 


link. byte  :=  BYTE (linkl+65) 
otherwise 

link. byte  :=  BYTE (linkl+77) 

IF 

command. byte  =  link.req 
SEQ 

handle. screen (link. byte, linkl, 

link2,4,5) 
num.reqs  :=  num.reqs  +  1 
goto. xy (screen,  4,6) 
write . int (screen, num. reqs, 8) 

command. byte  =  link.ack 
SEQ 

handle. screen (link. byte,  linkl, 

link2,7, 8) 
num.acks  :=  num.acks  +  1 
goto. xy (screen,  4,9) 
write . int (screen, num. acks, 8) 
links. used  :=  links. used  +  1 
num. used [links .used]  := 

num. used [links .used]  +  1 
goto . xy (screen, 70, links. used) 
write . int (screen, 

num. used [links .used] , 6) 

command. byte  =  link.brk 
SEQ 

handle .screen (link. byte, linkl, 

link2, 10, 11) 
handle. screen (45 (BYTE) , linkl, 

link2,7, 8) 
num.brks  :=  num.brks  +  1 
goto. xy  (screen,  4,12) 
write . int (screen, num.brks, 8) 
links. used  :=  links. used  -  1 

command. byte  =  link.abt 
SEQ 

handle . screen (link. byte, linkl, 

link2, 13,14) 
handle. screen (63 (BYTE) , linkl, link2, 7, 8) 
links. used  :=  links. used  -  1 
num.abts  :=  num.abts  +  1 
goto. xy (screen,  4,15) 
write . int (screen, num. abts, 8) 
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command. byte  =  link.rei 
SEQ 

handle . screen (link. byte, linkl, 

link2, 16, 17) 
handle. screen (33 (BYTE) , linkl, link2, 7, 8) 
num.reis  :=  num.reis  +  1 
goto. xy (screen,  4,18) 
write . int (screen, num. abts, 8) 

otherwise   —  including  link.rel's 
SKIP 

—  handle  various  user  keystrokes  for 

stepping  speed,  pause,  etc. 
clock  ?  now 
PRI  ALT 

keyboard  ?  key. in 
IF 

(key. in  =  81)  OR  (key. in  =  113) 
--  "Q":  quit 
SEQ 

send. quit  :=  TRUE 
more  :=  FALSE 
shutdown  !  terminate 

(key. in  =  43) 
—  '•  +  "  =  faster 
delay  :=  delay  /  2 

(key. in  =  45) 
--  "-"  =  slower 
delay  :=  delay  *  2 


(key. in  =  82)  OR  (key. in  =  114) 
__  iij^ti .  j-esend  a  byte 
up. out  !  BYTE (60) 

otherwise 

send. quit  :=  FALSE 

clock  ?  AFTER  now  PLUS  (delay  *  100) 
SKIP 

—  handle  TRAM  (USER)  DONE  notices 
(( (INT (command. byte) )  >  (99))  AND 
( (INT (command. byte) )  <  (116))) 
--  a  tram  has  sent  a  notice  it  is  done 

xmitting  on  link0&3 
SEQ 
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goto.xy (screen, 13+ ( ( (INT (command. byte) ) - 

100) *3) ,A) 
write. char  (screen, 2 (BYTE) ) 

( ( (INT(command.byte) )  >  (119))  AND 
( (INT (command. byte) )  <  (136))) 
--  a  tram  has  sent  a  notice  it  is  done 

xmitting  on  lin)c0&3 
SEQ 

goto. xy (screen, 13+ ( ( (INT (command. byte) ) - 

120)*3),5) 
write. char (screen, 2 (BYTE) ) 

—  handle  EDGE  BAD  info  from  T212  (via  input  buf) 
( ( (INT(command.byte) )  >  (199))  AND 
( (INT (command. byte) )  <  (216))) 
--  an  edge  status  has  changed  to  GOOD 
SEQ 

goto . xy (screen, 14+ ( ( (INT (command. byte) ) - 

200) *3) ,22) 
write . char (screen, 43 (BYTE) )  --  change  to  "+" 

(( (INT (command. byte) )  >  (219))  AND 
( (INT (command. byte) )  <  (236))) 
--  an  edge  status  has  changed  to  BAD 
SEQ 

goto . xy (screen, 14+ ( ( (INT (command. byte) ) - 

220) *3) ,22) 
write. char (screen, 45 (BYTE) )  --  change  to  "-" 
otherwise 

--  handle  unknown  bytes  received  (for  testing) 
SEQ 
IF 

command. byte  =  terminate 

up. out  !  terminate 
otherwise 

xlate .byte (screen, command. byte, 0) 

—  terminate  all  processes  in  this  monitoring  program, 

--  including  all  replicated  PARs  used  in  buffers 

IF 

send. quit  =  TRUE 

SEQ   --  flush  queues 

WHILE  (command. byte  <>  terminate) 

up. in  ?  command. byte 
up. out  !  terminate 
otherwise 
SKIP 
end  of  B004  Monitor 
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—  BUFFER. OUT  —  variable  buffer  on  pipe 

PROC  buffer. out  (CHAN  OF  ANY  into. buffer,  outof .buffer) 

VAL  INT  buffer. size   IS  3  : 
BYTE  byte. in,  byte. out  : 

[buffer. size-1] CHAN  OF  ANY  b: 

[buffer. size-2]BYTE  byte. hold  : 

[buffer. size-2] BOOL  buffer. on  : 
BOOL  first .buffer . on,  last .buffer . on  : 
VAL  terminate  IS  199  (BYTE) : 
VAL  BOOL  otherwise   IS   TRUE: 


PAR 

SEQ 

first, 

.buffer. on  :=  TRUE 

WHILE 

first .buffer .on 

SEQ 

into. buffer  ?  byte. in 

b[0]  !  byte. in 

IF 

byte. in  =  terminate 

first .buffer .on  := 

FALSE 

otherwise 

SKIP 

PAR 

PAR  p 

=  0  FOR  (buffer. size- 

2) 

SEQ 

buffer. on [p]  :=  TRUE 

WHILE  buffer. on[p] 

SEQ 

b[p]  ?  byte.hold[p] 

b[p+l]  !  byte.hold[p] 

IF 

byte.hold[p]  =  terminate 

buffer. on[p]  := 

=  FALSE 

otherwise 

SKIP 

SEQ 

last .buffer .on  :=  TRUE 
WHILE  last. buffer. on 
SEQ 

b [buffer . size-2]  ?  byte. out 
outof. buffer  !  byte. out 
IF 

byte. out  =  terminate 

last .buffer .on  :=  FALSE 
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otherwise 
SKIP 
:  --  end  of  buffer. out 

PAR 

buffer, in (up. in,  buffer .to . echo,  shut. down) 
statusdb (keyboard,  screen,  buffer .to. echo, 

echo. to. buffer,  shut. down,  no. trams) 
buffer. out (echo. to .buffer,  up. out) 
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